Abstract

To be practical, recognition systems must deal with uncertainty. Positions of image
features in scenes vary. Features sometimes fail to appear because of unfavorable
illumination. In this work, methods of statistical inference are combined with empirical
models of uncertainty in order to evaluate and refine hypotheses about the occurrence
of a known object in a scene.

Probabilistic models are used to characterize image features and their
correspondences. A statistical approach is taken for the acquisition of object models from
observations in images: Mean Edge Images are used to capture object features that
are reasonably stable with respect to variations in illumination.

The Alignment approach to recognition, which has been described by Huttenlocher
and Ullman, is used. The mechanisms that are employed to generate initial
hypotheses are distinct from those that are used to verify (and refine) them. In this work,
posterior probability and Maximum Likelihood are the criteria for evaluating and
refining hypotheses. The recognition strategy advocated in this work may be
summarized as Align Refine Verify, whereby local search in pose space is utilized to refine
hypotheses from the alignment stage before verification is carried out.

Two formulations of model-based object recognition are described. MAP Model
Matching evaluates joint hypotheses of match and pose, while Posterior Marginal
Pose Estimation evaluates the pose only. Local search in pose space is carried out
with the Expectation-Maximization (EM) algorithm.

Recognition experiments are described where the EM algorithm is used to refine
and evaluate pose hypotheses in 2D and 3D. Initial hypotheses for the 2D experiments
were generated by a simple indexing method: Angle Pair Indexing. The Linear
Combination of Views method of Ullman and Basri is employed as the projection
model in the 3D experiments.
Chapter 1

Introduction



Visual object recognition is the focus of the research reported in this thesis.
Recognition must deal with uncertainty to be practical. Positions of image features belonging
to objects in scenes vary. Features sometimes fail to appear because of unfavorable
illumination. In this work, methods of statistical inference are combined with
empirical models of uncertainty in order to evaluate hypotheses about the occurrence of a
known object in a scene. Other problems, such as the generation of initial hypotheses
and the acquisition of object model features, are also addressed.



1.1 The Problem 

Representative recognition problems and their solutions are illustrated in Figures 1-1
and 1-2. The problem is to detect and locate the car in digitized video images, using
previously available detailed information about the car. In these figures, object model
features are superimposed over the video images at the position and orientation where
the car was found. Figure 1-1 shows the results of 2D recognition, while Figure 1-2
illustrates the results of 3D recognition. These images are from experiments that are
described in Chapter 10. Practical solutions to problems like these will improve the
flexibility of robotic systems.





Figure 1-1: Representative Recognition Problem and Solution (2D)

Figure 1-2: Representative Recognition Problem and Solution (3D)
In this work, the recognition problem is restricted to finding occurrences of a single
object in scenes that may contain other unknown objects. Despite the simplification
and years of research, the problem remains largely unsolved. Robust systems that
can recognize smooth objects having six degrees of freedom of position, under varying
conditions of illumination, occlusion, and background, are not commercially available.
Much effort has been expended on this problem, as is evident in the comprehensive
reviews of research in computer-based object recognition by Besl and Jain [5], who
cited 203 references, and Chin and Dyer [18], who cited 155 references. The goal of
this thesis is to characterize, as well as to describe how to find, robust solutions to
visual object recognition problems.

1.2 The Approach

In this work, statistical methods are used to evaluate and refine hypotheses in object
recognition. Angle Pair Indexing, a means of generating hypotheses, is introduced.
These mechanisms are used in an extension of the Alignment method that includes a
pose refinement step. Each of these components is amplified below.

1.2.1 Statistical Approach

In this research, visual object recognition is approached via the principles of Maximum
Likelihood (ML) and Maximum-A-Posteriori probability (MAP). These principles,
along with specific probabilistic models of aspects of object recognition, are used to
derive objective functions for evaluating and refining recognition hypotheses. The ML
and MAP criteria have a long history of successful application in formulating decisions
and in making estimates from observed data. They have attractive properties of
optimality and are often useful when measurement errors are significant.

In other areas of computer vision, statistics has proven useful as a theoretical
framework. The work of Yuille, Geiger and Bülthoff on stereo [78] is one example,
while in image restoration the work of Geman and Geman [28], Marroquin [54], and
Marroquin, Mitter and Poggio [55] are others. The statistical approach that is used
in this thesis converts the recognition problem into a well defined (although not
necessarily easy) optimization problem. This has the advantage of providing an explicit
characterization of the problem, while separating it from the description of the
algorithms used to solve it. Ad hoc objective functions have been profitably used in some
areas of computer vision. Such an approach is used by Barnard in stereo matching
[2], Blake and Zisserman [7] in image restoration and Beveridge, Weiss and Riseman
[6] in line segment based model matching. With this approach, plausible forms for
components of the objective function are often combined using trade-off parameters.
Such trade-off parameters are determined empirically. An advantage of deriving
objective functions from statistical theories is that assumptions become explicit: the
forms of the objective function components are clearly related to specific probabilistic
models. If these models fit the domain then there is some assurance that the resulting
criteria will perform well. A second advantage is that the trade-off parameters in the
objective function can be derived from measurable statistics of the domain.

1.2.2 Feature-Based Recognition

This work uses a feature-based approach to object recognition. Features are
abstractions like points or curves that summarize some structure of the patterns in an image.
There are several reasons for using feature-based approaches to object recognition.

• Features can concisely represent objects and images. Features derived from
brightness edges can summarize the important events of an image in a way that
is reasonably stable with respect to scene illumination.

• In the alignment approach to recognition (to be described shortly), hypotheses
are verified by projecting the object model into the image, then comparing the
prediction against the image. By using compact, feature-based representations
of the object, projection costs may be kept low.

• Features also facilitate hypothesis generation. Indexing methods are attractive
mechanisms for hypothesis generation. Such methods use tables indexed by
properties of small groups of image features to quickly locate corresponding
model features.

Object Features from Observation

A major issue that must be faced in model-based object recognition concerns the
origin of the object model itself. The object features that are used in this work are
derived from actual image observations. This method of feature acquisition
automatically favors those features that are likely to be detected in images. The potentially
difficult problem of predicting image features from abstract geometric models is
bypassed. This prediction problem is manageable in some constrained domains (with
polyhedral objects, for instance) but it is often difficult, especially with smooth
objects, low resolution images and lighting variations.

For robustness, simple local image features are used in this work. Features of this
sort are easily detected, in contrast to extended features like line segments. Extended
features have been used in some systems for hypothesis generation because their
additional structure provides more constraint than that offered by simple local features.
Extended features, nonetheless, have drawbacks in being difficult to detect due to
occlusions and localized failures of image contrast. Because of this, systems that rely
on distinguished features can lose robustness.

1.2.3 Alignment

Hypothesize-and-test, or alignment methods have proven effective in visual object
recognition. Huttenlocher and Ullman [43] used search over minimal sets of
corresponding features to establish candidate hypotheses. In their work these hypotheses,
or alignments, are tested by projecting the object model into the image using the
pose (position and orientation) implied by the hypothesis, and then by performing a
detailed comparison with the image. The basic strategy of the alignment method is
to use separate mechanisms for generating and testing hypotheses.

Recently, indexing methods have become available for efficiently generating
hypotheses in recognition. These methods avoid a significant amount of search by using
pre-computed tables for looking up the object features that might correspond to a
group of image features. The geometric hashing method of Lamdan and Wolfson [49]
uses invariant properties of small groups of features under affine transformations as
the look-up key. Clemens and Jacobs [19] [20], and Jacobs [45] described indexing
methods that gain efficiency by using a feature grouping process to select small sets
of image features that are likely to belong to one object in the scene.

In this work, a simple form of 2D indexing, Angle Pair Indexing, is used to generate
initial hypotheses. It uses an invariant property of pairs of image features under
translation, rotation and scale. This is described in Chapter 9.

The Hough transform [40] [44] is another commonly used method for generating
hypotheses in object recognition. In the Hough method, feature-based clustering is
performed in pose space, the space of the transformations describing the possible
motion of the object. This method was used by Grimson and Lozano-Pérez [36] to
localize the search in recognition.
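The clustering idea behind the Hough method can be illustrated with a minimal sketch. It assumes, purely for illustration, that the pose is a 2D translation and that features are 2D points; the function name `hough_translation` and the `bin_size` parameter are hypothetical and are not taken from the thesis or from the cited work.

```python
import math
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]

def hough_translation(image_pts: List[Point], model_pts: List[Point],
                      bin_size: float = 1.0) -> Tuple[Point, int]:
    """Vote in a quantized 2D translation space (an illustrative pose space).

    Every (image feature, model feature) pairing votes for the translation
    that would bring the model feature onto the image feature. The fullest
    bin is returned as (bin center, vote count).
    """
    if not image_pts or not model_pts:
        raise ValueError("both feature sets must be non-empty")
    votes: Counter = Counter()
    for yx, yy in image_pts:
        for mx, my in model_pts:
            key = (math.floor((yx - mx) / bin_size),
                   math.floor((yy - my) / bin_size))
            votes[key] += 1
    (bx, by), count = votes.most_common(1)[0]
    center = ((bx + 0.5) * bin_size, (by + 0.5) * bin_size)
    return center, count

# Three model points shifted by (2.2, -0.8), plus one clutter point.
model = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
image = [(2.2, -0.8), (12.2, -0.8), (2.2, 9.2), (50.0, 50.0)]
best_center, best_votes = hough_translation(image, model)
```

Because every pairing votes, including wrong ones, a bin can collect votes from mutually inconsistent pairings, which is one reason a separate verification stage is useful.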

These fast methods of hypothesis generation provide ongoing reasons for using the
alignment approach. They are often most effective when used in conjunction with
verification. Verification is important because indexing methods can be susceptible
to table collisions, while Hough methods sometimes generate false positives due to
their aggregation of inconsistent evidence in pose space bins. This last point has been
argued by Grimson and Huttenlocher [35].

The usual alignment strategy may be summarized as align verify. Alignment and
verification place differing pressures on the choice of features for recognition.
Mechanisms used for generating hypotheses typically have computational complexity that
is polynomial in the number of features involved. Because of this, there is significant
advantage to using low resolution features: there are fewer of them. Unfortunately,
pose estimates based on coarse features tend to be less accurate than those based on
high resolution features.

Likewise, verification is usually more reliable with high resolution features. This
approach yields more detailed comparisons. These differing pressures may be
accommodated by employing coarse-fine approaches. The coarse-fine strategy was utilized
successfully in stereo by Grimson [33]. In the coarse-fine strategy, hypotheses
derived from low-resolution features limit the search for hypotheses derived from
high-resolution features. There are some potential difficulties that arise when applying
coarse-fine methods in conjunction with 3D object models. These may be avoided
by using view-based alternatives to 3D object modeling. These issues are discussed
more fully in Chapter 4.



Align Refine Verify

The recognition strategy advocated in this work may be summarized as align refine
verify. This approach has been used by Lipson [50] in refining alignments. The key
observation is that local search in pose space may be used to refine the hypothesis
from the alignment stage before verification is carried out. In hypothesize-and-test
methods, the pose estimates of the initial hypotheses tend to be somewhat inaccurate,
since they are based on minimal sets of corresponding features. Better pose estimates
(hence, better verifications) are likely to result from using all supporting image feature
data, rather than a small subset. Chapter 8 describes a method that refines the pose
estimate while simultaneously identifying and incorporating the constraints of all
supporting image features.
1.3 Guide to Thesis

Briefly, the presentation of the material in this thesis is essentially bottom-up. The
early chapters are concerned with building the components of the formulation, while
the main contributions, the statistical formulations of object recognition, are
described in Chapters 6 and 7. After that, related algorithms are described, followed
by experiments and conclusions.

In more detail, Chapter 2 describes the probabilistic models of the
correspondences, or mapping between image features and features belonging to either the
object or to the background. These models use the principle of maximum entropy where
little information is available before the image is observed. In Chapter 3,
probabilistic models are developed that characterize the feature detection process. Empirical
evidence is described to support the choice of model.

Chapter 4 discusses a way of obtaining average object edge features from a
sequence of observations of the object in images. Deterministic models of the projection
of features into the image are discussed in Chapter 5. The projection methods used
in this work are linear in the parameters of the transformations. Methods for 2D and
3D are discussed, including the Linear Combination of Views method of Ullman and
Basri [71].

In Chapter 6 the above models are combined in a Bayesian framework to construct
a criterion, MAP Model Matching, for evaluating hypotheses in object recognition.
In this formulation, complete hypotheses consist of a description of the
correspondences between image and object features, as well as the pose of the object. These
hypotheses are evaluated by their posterior (after the image is observed) probability.
A recognition experiment is described that uses the criteria to guide a heuristic search
over correspondences. A connection between MAP Model Matching and a method of
robust chamfer matching [47] is described.

Building on the above, a second criterion is described in Chapter 7: Posterior
Marginal Pose Estimation (PMPE). Here, the solution being sought is simply the
pose of the object. The posterior probability of poses is obtained by taking the
formal marginal, over all possible matches, of the posterior probability of the joint
hypotheses of MAP Model Matching. This results in a smooth, non-linear objective
function for evaluating poses. The smoothness of the objective function facilitates
local search in pose space as a mechanism for refining hypotheses in recognition.
Some experimental explorations of the objective function in pose space are described.
These characterizations are carried out in two domains: video imagery and synthetic
radar range imagery.

Chapter 8 describes use of the Expectation-Maximization (EM) algorithm [21]
for finding local maxima of the PMPE objective function. This algorithm alternates
between the M step, a weighted least squares pose estimate, and the E step, a
recalculation of the weights based on a saturating non-linear function of the residuals.
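The alternation described above can be sketched on a toy problem. This is only an illustration under strong assumptions that are not taken from the thesis: the pose is restricted to a 2D translation, residuals are weighted with an isotropic Gaussian of width `sigma`, and a constant `background` term in the denominator makes the weights saturate. The function name and all parameters are hypothetical; Chapter 8 gives the actual algorithm.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def em_refine_translation(image_pts: List[Point], model_pts: List[Point],
                          t0: Point, sigma: float = 1.0,
                          background: float = 0.1,
                          iterations: int = 10) -> Point:
    """Refine a 2D translation by alternating weight and pose updates."""
    tx, ty = t0
    for _ in range(iterations):
        sum_w = sum_x = sum_y = 0.0
        for yx, yy in image_pts:
            # E step: weights from a saturating function of the residuals.
            g = []
            for mx, my in model_pts:
                rx, ry = yx - (mx + tx), yy - (my + ty)
                g.append(math.exp(-(rx * rx + ry * ry) / (2.0 * sigma * sigma)))
            denom = background + sum(g)
            # Accumulate the sums needed by the weighted least squares solve.
            for (mx, my), g_ij in zip(model_pts, g):
                w = g_ij / denom
                sum_w += w
                sum_x += w * (yx - mx)
                sum_y += w * (yy - my)
        if sum_w == 0.0:
            break  # no image feature supports the current pose
        # M step: the weighted least squares translation is a weighted mean.
        tx, ty = sum_x / sum_w, sum_y / sum_w
    return tx, ty

model = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
image = [(2.0, -1.0), (12.0, -1.0), (2.0, 9.0), (50.0, 50.0)]  # last is clutter
refined = em_refine_translation(image, model, t0=(1.5, -0.5))
```

In this sketch the clutter point receives essentially zero weight, so the refined translation is driven by the three supporting features rather than by a hand-picked minimal subset.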

This algorithm is used to refine and evaluate poses in 2D and 3D recognition
experiments that are described in Chapter 10. Initial hypotheses for the 2D experiments
were generated by a simple indexing method, Angle Pair Indexing, that is described
in Chapter 9. The Linear Combination of Views method of Ullman and Basri [71] is
employed as the projection model in the 3D experiments reported there.

Finally, some conclusions are drawn in Chapter 11. The notation used throughout
is summarized in Appendix A.
Chapter 2

Modeling Feature Correspondence



This chapter is concerned with probabilistic models of feature correspondences. These
models will serve as priors in the statistical theories of object recognition that are
described in Chapters 6 and 7, and are important components of those formulations.
They are used to assess the probability that features correspond before the image data
is compared to the object model. They capture the expectation that some features
in an image are anticipated to be due to the object.

Three different models of feature correspondence are described, one of which is
used in the recognition experiments described in Chapters 6, 7, and 10.



2.1 Features and Correspondences

This research focuses on feature-based object recognition. The object being sought
and the imge being analyzed consist of discrete features. 

Let the image that is to be analyzed be represented by a set of $v$-dimensional
point features

$$Y = \{Y_1, Y_2, \ldots, Y_n\} , \qquad Y_i \in R^v .$$

Image features are discussed in more detail in Chapters 3 and 5.

The object to be recognized is also described by a set of features,

$$M = \{M_1, M_2, \ldots, M_m\} .$$

The features will usually be represented by real matrices. Additional details on object
features appear in Chapters 4 and 5.

In this work, the interpretation of the features in an image is represented by the
variable $\Gamma$, which describes the mapping from image features to object features or the
scene background. This is also referred to as the correspondences.

$$\Gamma = \{\Gamma_1, \Gamma_2, \ldots, \Gamma_n\} , \qquad \Gamma_i \in M \cup \{\perp\} .$$

In an interpretation, each image feature, $Y_i$, will be assigned either to some object
feature $M_j$, or to the background, which is denoted by the symbol $\perp$. This symbol
plays a role similar to that of the null character in the interpretation trees of Grimson
and Lozano-Pérez [36]. An interpretation is illustrated in Figure 2-1. $\Gamma$ is a collection
of variables that is indexed in parallel with the image features. Each variable $\Gamma_i$
represents the assignment of the corresponding image feature $Y_i$. It may take on as
value any of the object features $M_j$, or the background, $\perp$. Thus, the meaning of the
expression $\Gamma_5 = M_6$ is that image feature five is assigned to object feature six; likewise
$\Gamma_7 = \perp$ means that image feature seven has been assigned to the background. In an
interpretation each image feature is assigned, while some object features may not be.
Additionally, several image features may be assigned to the same object feature. This
representation allows image interpretations that are implausible; other mechanisms
are used to encourage metrical consistency.
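As a concrete illustration of this representation, the following sketch encodes an interpretation as a list indexed in parallel with the image features. The encoding is an assumption made for the example, not something the thesis specifies: object features are referred to by 0-based index, and `None` stands for the background symbol. The helper names are hypothetical.

```python
from typing import List, Optional, Set

# Hypothetical encoding: gamma[i] is the 0-based index of the object feature
# assigned to image feature i, or None when the feature is background.
Interpretation = List[Optional[int]]

def matched_object_features(gamma: Interpretation) -> Set[int]:
    """Object features that receive at least one image feature."""
    return {j for j in gamma if j is not None}

def background_count(gamma: Interpretation) -> int:
    """Number of image features assigned to the background."""
    return sum(1 for j in gamma if j is None)

# Seven image features. In 1-based terms: feature five -> object feature six,
# feature seven -> background, and features one and two share object feature one.
gamma: Interpretation = [0, 0, None, 2, 5, None, None]

used = matched_object_features(gamma)
n_background = background_count(gamma)
```

Every image feature has an entry, while an object feature may be absent from the list or appear more than once, which matches the many-to-one character of the mapping described above.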







Figure 2-1: Image Features, Object Features, and Correspondences

2.2 An Independent Correspondence Model

In this section a simple probabilistic model of correspondences is described. The
intent is to capture some information bearing on correspondences before the image is
compared to the object. This model has been designed to be a reasonable compromise
between simplicity and accuracy.

In this model, the correspondence statuses of differing image features are assumed
to be independent, so that

$$p(\Gamma) = \prod_i p(\Gamma_i) . \qquad (2.1)$$

Here, $p(\Gamma)$ is a probability mass function on the discrete variable $\Gamma$. There is
evidence against using statistical independence here; for example, occlusion is locally
correlated. Independence is used as an engineering approximation that simplifies the
resulting formulations of recognition. It may be justified by the good performance
of the recognition experiments described in Chapters 6, 7, and 10. Few recognition
systems have used non-independent models of correspondence. Breuel outlined one
approach in his thesis [9]. A relaxation of this assumption is discussed in the following
section.

The component probability function is designed to characterize the amount of
clutter in the image, but to be otherwise as non-committal as possible:

$$p(\Gamma_i) = \begin{cases} B & \text{if } \Gamma_i = \perp \\ \frac{1 - B}{m} & \text{otherwise.} \end{cases} \qquad (2.2)$$

The joint model $p(\Gamma)$ is the maximum entropy probability function that is
consistent with the constraint that the probability of an image feature belonging to the
background is $B$. $B$ may be estimated by taking simple statistics on images from the
domain. $B = .9$ would mean that 90% of image features are expected to be due to
the background.

Having $B$ constant during recognition is an approximation. The number of
features due to the object will likely vary according to the size of the object in the scene.
$B$ could be estimated at recognition time by pre-processing mechanisms that evaluate
image clutter, and factor in expectations about the size of the object. In practice,
the approximation works well in controlled situations.

The independent correspondence model is used in the experiments reported in
this research.
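Equations 2.1 and 2.2 can be evaluated directly. The sketch below is a minimal illustration; it assumes a hypothetical encoding in which an interpretation is a list with `None` for the background and an integer index for an object feature, and it works with logarithms only to avoid underflow for long feature lists.

```python
import math
from typing import List, Optional

def log_prior_independent(gamma: List[Optional[int]], m: int, B: float) -> float:
    """Log of p(Gamma) under the independent model (Equations 2.1 and 2.2).

    gamma : one entry per image feature, None meaning background
    m     : number of object features
    B     : prior probability that an image feature is background
    """
    if not 0.0 < B < 1.0:
        raise ValueError("B must lie strictly between 0 and 1")
    if m < 1:
        raise ValueError("the object needs at least one feature")
    log_background = math.log(B)
    log_matched = math.log((1.0 - B) / m)
    return sum(log_background if g is None else log_matched for g in gamma)

# Two background features and one matched feature, with m = 10 and B = 0.9:
# the prior is 0.9 * (0.1 / 10) * 0.9.
example_log_prior = log_prior_independent([None, 3, None], m=10, B=0.9)
```

Note that the value depends only on how many features are matched, not on which object features they are matched to, which reflects the non-committal design of Equation 2.2.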



2.3 A Markov Correspondence Model

As indicated above, one inaccuracy of the independent correspondence model is that
sample realizations of $\Gamma$ drawn from the probability function of Equations 2.1 and
2.2 will tend to be overly fragmented in their modeling of occlusion. This section
describes a compromise model that relaxes the independence assumption somewhat
by allowing the correspondence status of an image feature ($\Gamma_i$) to depend on that of
its neighbors. In the domain of this research, image features are fragments of image
edge curves. These features have a natural neighbor relation, adjacency along the
image edge curve, that may be used for constructing a 1D Markov Random Field
(MRF) model of correspondences. MRFs are collections of random variables whose
conditional dependence is restricted to limited size neighborhoods. MRF models are
discussed by Geman and Geman [28]. The following describes an MRF model of
correspondences intended to provide a more accurate model of occlusion.



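$$p(\Gamma) = q(\Gamma_1) q(\Gamma_2) \cdots q(\Gamma_n) \, r_1(\Gamma_1, \Gamma_2) \, r_2(\Gamma_2, \Gamma_3) \cdots r_{n-1}(\Gamma_{n-1}, \Gamma_n) , \qquad (2.3)$$

where

$$q(\Gamma_i) = \begin{cases} e_1 & \text{if } \Gamma_i = \perp \\ e_2 & \text{otherwise} \end{cases} \qquad (2.4)$$

and

$$r_i(a, b) = \begin{cases} \begin{cases} e_3 & \text{if } a = \perp \text{ and } b = \perp \\ e_4 & \text{if } a \neq \perp \text{ and } b \neq \perp \\ e_5 & \text{otherwise} \end{cases} & \text{if features } i \text{ and } i + 1 \text{ are neighbors} \\ 1 & \text{otherwise.} \end{cases} \qquad (2.5)$$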



The assignment of indices to image features should be done in such a way that
neighboring features have adjacent indices. The functions $r_i(\cdot, \cdot)$ model the
interaction of neighboring features. The parameters $e_1 \ldots e_5$ may be adjusted so that the
probability function $p(\Gamma)$ is consistent with observed statistics on clutter and
frequency of adjacent occlusions. Additionally, the parameters must be constrained so
that Equation 2.3 actually describes a probability function. When these constraints
are met, the model will be the maximum entropy probability function consistent with
the constraints. Satisfying the constraints is a non-trivial selection problem that may
be approached iteratively. Fortunately, this calculation doesn't need to be carried out
at recognition time. Goldman [30] discusses methods of calculating these parameters.

The model outlined in Equations 2.3 - 2.5 is a generalization of the Ising spin
model. Ising models are used in statistical physics to model ferromagnetism [73].
Samples drawn from Ising models exhibit spatial clumping whose scale depends on
the parameters. In object recognition, this clumping behavior may provide a more
accurate model of occlusion.
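A small sketch may help make Equations 2.3 - 2.5 concrete. It evaluates the product form for a given interpretation and, for very small problems only, checks by brute-force enumeration whether a parameter choice sums to one. The encoding (`None` for background, integers for object features), the `neighbors` list, and the function names are assumptions made for this illustration; the enumeration is a sanity check and not the parameter selection method discussed in [30].

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

Params = Tuple[float, float, float, float, float]  # (e1, e2, e3, e4, e5)

def markov_product(gamma: Sequence[Optional[int]],
                   neighbors: Sequence[bool], e: Params) -> float:
    """Right-hand side of Equation 2.3 for one interpretation.

    neighbors[i] is True when features i and i+1 are neighbors.
    """
    e1, e2, e3, e4, e5 = e
    value = 1.0
    for g in gamma:                       # unary terms q, Equation 2.4
        value *= e1 if g is None else e2
    for i in range(len(gamma) - 1):       # pairwise terms r_i, Equation 2.5
        if not neighbors[i]:
            continue                      # r_i = 1 for non-neighbors
        a, b = gamma[i], gamma[i + 1]
        if a is None and b is None:
            value *= e3
        elif a is not None and b is not None:
            value *= e4
        else:
            value *= e5
    return value

def total_mass(n: int, m: int, neighbors: Sequence[bool], e: Params) -> float:
    """Sum of Equation 2.3 over all (m+1)^n interpretations (tiny n only)."""
    states: List[Optional[int]] = [None] + list(range(m))
    return sum(markov_product(g, neighbors, e) for g in product(states, repeat=n))

# With e3 = e4 = e5 = 1 the model reduces to the independent model of
# Section 2.2, so choosing e1 = B and e2 = (1 - B) / m gives total mass one.
B, m = 0.9, 2
mass = total_mass(3, m, [True, True], (B, (1.0 - B) / m, 1.0, 1.0, 1.0))
```

Choosing $e_3$ and $e_4$ larger than $e_5$ favors runs of like-labeled neighbors, which is the clumping behavior described above, but the parameters then need to be rebalanced so that the total mass remains one.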

The standard Ising model is shown for reference in the following equations. It has
been restricted to 1D, and has been adapted to the notation of this section.

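$$\sigma_i \in \{-1, 1\}$$

$$p(\sigma_1 \sigma_2 \ldots \sigma_n) = \frac{1}{Z} \, q(\sigma_1) q(\sigma_2) \cdots q(\sigma_n) \, r(\sigma_1, \sigma_2) \, r(\sigma_2, \sigma_3) \cdots r(\sigma_{n-1}, \sigma_n)$$

$$q(a) = \begin{cases} \exp\left(\frac{\mu H}{kT}\right) & \text{if } a = 1 \\ \exp\left(-\frac{\mu H}{kT}\right) & \text{otherwise} \end{cases}$$

$$r(a, b) = \begin{cases} \exp\left(\frac{J}{kT}\right) & \text{if } a = b \\ \exp\left(-\frac{J}{kT}\right) & \text{otherwise.} \end{cases}$$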



Here, $Z$ is a normalization constant, $\mu$ is the moment of the magnetic dipoles,
$H$ is the strength of the applied magnetic field, $k$ is Boltzmann's constant, $T$ is
temperature, and $J$ is a neighbor interaction constant called the exchange energy.

The approach to modeling correspondences that is described in this section was
outlined in Wells [74] [75]. Subsequently, Breuel [9] described a similar local
interaction model of occlusion in conjunction with a simplified statistical model of recognition
that used boolean features in a classification-based scheme.

The Markov correspondence model is not used in the experiments reported in this
research.



2.4 Incorporating Saliency

Another route to more accurate modeling of correspondences is to exploit bottom-up
saliency processes to suggest which image features are most likely to correspond to
the object. One such process is described by Ullman and Shashua [66].

For concreteness, assume that the saliency process provides a per-feature measure
of saliency, $S_i$. To incorporate this information, we construct $p(\Gamma_i = \perp \mid S_i)$. This may
be conveniently calculated via Bayes' rule as follows:

$$p(\Gamma_i = \perp \mid S_i) = \frac{p(S_i \mid \Gamma_i = \perp) \, p(\Gamma_i = \perp)}{p(S_i)} .$$

$p(S_i \mid \Gamma_i = \perp)$ and $p(S_i)$ are probability densities that may be estimated from
observed frequencies in training data. As in Section 2.2, we set $p(\Gamma_i = \perp) = B$.
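The Bayes' rule calculation above can be sketched numerically. One assumption is added here that the text does not make: instead of estimating $p(S_i)$ directly, the sketch expands it by total probability using a second, hypothetical density for the saliency of object features. The function name and arguments are likewise illustrative.

```python
def background_posterior(lik_background: float, lik_object: float, B: float) -> float:
    """p(Gamma_i = background | S_i) via Bayes' rule.

    lik_background : p(S_i | Gamma_i = background), evaluated at the observed S_i
    lik_object     : p(S_i | Gamma_i != background), an assumed second density
    B              : prior background probability, as in Section 2.2
    """
    if not 0.0 <= B <= 1.0:
        raise ValueError("B must be a probability")
    # p(S_i) by total probability (an assumption of this sketch).
    evidence = lik_background * B + lik_object * (1.0 - B)
    if evidence <= 0.0:
        raise ValueError("the observed saliency has zero density under both cases")
    return lik_background * B / evidence

# A salient feature: its saliency value is nine times more likely under the
# object density, which exactly offsets a 0.9 background prior.
posterior = background_posterior(lik_background=0.2, lik_object=1.8, B=0.9)
```

When the two densities agree at the observed saliency value, the posterior falls back to the prior $B$, so saliency only changes the model where it is informative.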
A feature-specific background probability may then be defined as follows:

$$B_i \equiv p(\Gamma_i = \perp \mid S_i) = \frac{p(S_i \mid \Gamma_i = \perp) \, B}{p(S_i)} .$$

In this case the complete probability function on $\Gamma_i$ will be

$$p(\Gamma_i) = \begin{cases} B_i & \text{if } \Gamma_i = \perp \\ \frac{1 - B_i}{m} & \text{otherwise.} \end{cases} \qquad (2.6)$$

This model is not used in the experiments described in this research.

2.5 Conclusions

The simplest of the three models described, the independent correspondence model,
has been used to good effect in the recognition experiments described in Chapters 6, 7,
and 10. In some domains additional robustness in recognition might result from using
either the Markov correspondence model, or by incorporating saliency information.



Chapter 3 



Modeling Image Features



Probabilistic models of image features are the topic of this chapter. These are
another important component of the statistical theories of object recognition that are
described in Chapters 6 and 7.

The probability density function for the coordinates of image features, conditioned
on correspondences and pose, is defined. The PDF has two important cases,
depending on whether the image feature is assigned to the object, or to the background.
Features matched to the object are modeled with normal densities, while uniform
densities are used for background features. Empirical evidence is provided to support
the use of normal densities for matched features. A form of stationarity is described.

Many recognition systems implicitly use uniform densities (rather than normal
densities) to model matched image features (bounded error models). The empirical
evidence of Section 3.2.1 indicates that the normal model may sometimes be better.
Because of this, use of normal models may provide better performance in recognition.

3.1 A Uniform Model for Background Features

The image features, $Y_i$, are $v$-dimensional vectors. When assigned to the background,
they are assumed to be uniformly distributed,

$$p(Y_i \mid \Gamma, \beta) = \frac{1}{W_1 W_2 \cdots W_v} \quad \text{if } \Gamma_i = \perp . \qquad (3.1)$$

(The PDF is defined to be zero outside the coordinate space of the image features,
which has extent $W_i$ along dimension $i$.) $\Gamma$ describes the correspondences from image
features to object features, and $\beta$ describes the position and orientation, or pose, of
the object. For example, if the image features are 2D points in a 640 by 480 image,
then $p(Y_i \mid \perp, \beta) = \frac{1}{640 \cdot 480}$ within the image. For $Y_i$, this probability function depends
only on the $i$'th component of $\Gamma$.

Providing a satisfying probability density function for background features is
problematical. Equation 3.1 describes the maximum entropy PDF consistent with the
constraint that the coordinates of image features are always expected to lie within
the coordinate space of the image features. E.T. Jaynes [46] has argued that
maximum entropy distributions are the most honest representation of a state of incomplete
knowledge.



3.2 A Normal Model for Matched Features

Image features that are matched to object features are assumed to be normally
distributed about their predicted position in the image,

$$p(Y_i \mid \Gamma, \beta) = G_{\psi_{ij}}(Y_i - P(M_j, \beta)) \quad \text{if } \Gamma_i = M_j . \qquad (3.2)$$

Here $Y_i$, $\Gamma$, and $\beta$ are defined as above.

Figure 3-1: Fine Image Features and Fine Model Features

$G_{\psi_{ij}}$ is the $v$-dimensional Gaussian probability density function with covariance
matrix $\psi_{ij}$,

$$G_{\psi_{ij}}(x) = (2\pi)^{-\frac{v}{2}} \, |\psi_{ij}|^{-\frac{1}{2}} \exp\left(-\frac{1}{2} x^T \psi_{ij}^{-1} x\right) .$$

The covariance matrix $\psi_{ij}$ is discussed more fully in Section 3.3.

When $\Gamma_i = M_j$, the predicted coordinates of image feature $Y_i$ are given by
$P(M_j, \beta)$, the projection of object feature $j$ into the image with object pose $\beta$.
Projection and pose are discussed in more detail in Chapter 5.
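The two cases of the feature density (Equations 3.1 and 3.2) can be combined in a short sketch for 2D point features. The function names are hypothetical, the projection is not implemented (the predicted position $P(M_j, \beta)$ is simply passed in), and the image coordinate space is assumed to start at the origin.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]
Cov2 = Tuple[Tuple[float, float], Tuple[float, float]]

def gaussian_density_2d(residual: Point, cov: Cov2) -> float:
    """Zero-mean 2D Gaussian density evaluated at the residual."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    x, y = residual
    # x^T cov^{-1} x using the closed-form inverse of a 2x2 matrix.
    quad = (d * x * x - (b + c) * x * y + a * y * y) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def feature_density(y: Point, predicted: Optional[Point], cov: Cov2,
                    extents: Sequence[float]) -> float:
    """p(Y_i | Gamma, beta): uniform for background, normal for matched features.

    predicted : projected position of the matched object feature, or None when
                the image feature is assigned to the background
    extents   : image extents (W_1, W_2)
    """
    if predicted is None:
        inside = all(0.0 <= coord <= w for coord, w in zip(y, extents))
        return 1.0 / math.prod(extents) if inside else 0.0
    residual = (y[0] - predicted[0], y[1] - predicted[1])
    return gaussian_density_2d(residual, cov)

identity: Cov2 = ((1.0, 0.0), (0.0, 1.0))
p_background = feature_density((100.0, 50.0), None, identity, (640.0, 480.0))
p_matched = feature_density((100.0, 50.0), (100.0, 50.0), identity, (640.0, 480.0))
```

With a unit covariance, a perfectly predicted feature has density $1/(2\pi)$, several orders of magnitude above the background density of $1/(640 \cdot 480)$.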



3.2.1 Empirical Evidence for the Normal Model

This section describes some empirical evidence from the domain of video image edge
features indicating that normal probability densities are good models of feature
fluctuations, and that they can be better than uniform probability densities. The
evidence is provided in the form of observed and fitted cumulative distributions and
Kolmogorov-Smirnov tests. The model distributions were fitted to the data using the
Maximum Likelihood method.

The data that is analyzed are the perpendicular and parallel deviations of fine
and coarse edge features derived from video images. The fine and coarse features are
shown in Figures 3-1 and 3-3 respectively.

The model features are from Mean Edge Images; these are described in Section
4.4. The edge operator used in obtaining the image features is ridges in the magnitude
of the image gradient, as discussed in Section 4.4. The smoothing standard deviation
used in the edge detection was 2 and 4 pixels respectively for the fine and coarse
features. These features were also used in the experiments reported in Section 10.4,
and the correspondences were used there as training data.

Figure 3-2: Fine Feature Correspondences

Figure 3-3: Coarse Image Features and Coarse Model Features

Figure 3-4: Coarse Feature Correspondences

For the analysis in this section, the feature data consists of the average of the
x and y coordinates of the pixels from edge curve fragments - they are 2D point
features. The features are displayed as circular arc fragments for clarity. The edge
curves were broken arbitrarily into 40 and 20 pixel fragments for the fine and coarse
features respectively.

Correspondences from image features to model features were established by a
neutral subject using a mouse. These correspondences are indicated by heavy lines
in Figures 3-2 and 3-4. Perpendicular and parallel deviations of the corresponding
features were calculated with respect to the normals to edge curves at the image
features.

Figure 3-5 shows the cumulative distributions of the perpendicular and parallel
deviations of the fine features. The cumulative distributions of fitted normal densities
are plotted as heavy dots over the observed distributions. The distributions were fitted
to the data using the Maximum Likelihood method - the mean and variance of the
normal density are set to the mean and variance of the data. These figures show good
agreement between the observed distributions and the fitted normal distributions.
Similar observed and fitted distributions for the coarse deviations are shown in Figure
3-6, again with good agreement.

The observed cumulative distributions are shown again in Figures 3-7 and 3-8,
this time with the cumulative distributions of fitted uniform densities over-plotted
in heavy dots. As before, the uniform densities were fitted to the data using the
Maximum Likelihood method - in this case the uniform densities are adjusted to just
include the extreme data. These figures show relatively poor agreement between the
observed and fitted distributions, in comparison to normal densities.

Figure 3-5: Observed Cumulative Distributions and Fitted Normal Cumulative
Distributions for Fine Features (panel title recovered: CDF and Normal Distribution
for Fine Parallel Deviations)

Figure 3-6: Observed Cumulative Distributions and Fitted Normal Cumulative
Distributions for Coarse Features (panel titles: CDF and Normal Distribution for
Coarse Perpendicular Deviations; CDF and Normal Distribution for Coarse Parallel
Deviations)

Figure 3-7: Observed Cumulative Distributions and Fitted Uniform Cumulative
Distributions for Fine Features (panel title recovered: CDF and Uniform Distribution
for Fine Parallel Deviations)

Figure 3-8: Observed Cumulative Distributions and Fitted Uniform Cumulative
Distributions for Coarse Features (panel titles: CDF and Uniform Distribution for
Coarse Perpendicular Deviations; CDF and Uniform Distribution for Coarse Parallel
Deviations)

Table 3.1: Kolmogorov-Smirnov Tests



Kolmogorov-Smirnov Tests



The Kolmogorov-Smirnov (KS) test [59] is one way of analyzing the agreement
between observed and fitted cumulative distributions, such as the ones in Figures 3-5
to 3-8. The KS test is computed on the magnitude of the largest difference between
the observed and hypothesized (fitted) distributions. This will be referred to as $D$.
The probability distribution on this distance, under the hypothesis that the data were
drawn from the hypothesized distribution, can be calculated. An asymptotic formula
is given by

$$P(D \geq D_o) = Q(\sqrt{n}\, D_o)$$

where

$$Q(x) = 2 \sum_{j=1}^{\infty} (-1)^{j-1} \exp(-2 j^2 x^2),$$

$n$ is the number of data points, and $D_o$ is the observed value of $D$.

The results of KS tests of the consistency of the data with fitted normal and
uniform distributions are shown in Table 3.1. Low values of $P(D \geq D_o)$ suggest
incompatibility between the data and the hypothesized distribution. In the cases
of fine perpendicular and parallel deviations, and coarse perpendicular deviations,
refutation of the uniform model is strongly indicated. Strong contradictions of the
fitted normal models are not indicated in any of the cases.
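The fitting and testing procedure of this section can be sketched as follows. This is a minimal illustration, not the code used for Table 3.1: the function names are hypothetical, NumPy is assumed, and the series for Q is simply truncated after a fixed number of terms. Note also that fitting the parameters to the same data that is then tested tends to make the reported probabilities optimistic.

```python
import math
import numpy as np

def normal_cdf(x, mean, std):
    """CDF of a normal density with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def uniform_cdf(x, lo, hi):
    """CDF of a uniform density on [lo, hi]."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

def ks_statistic(data, cdf):
    """Largest gap D between the empirical CDF of `data` and `cdf`."""
    xs = np.sort(np.asarray(data, dtype=float))
    n = len(xs)
    f = np.array([cdf(x) for x in xs])
    above = np.arange(1, n + 1) / n - f   # empirical CDF just after each sample
    below = f - np.arange(0, n) / n       # empirical CDF just before each sample
    return float(max(above.max(), below.max()))

def q_ks(x, terms=100):
    """Truncated series Q(x) = 2 * sum_{j>=1} (-1)^(j-1) exp(-2 j^2 x^2)."""
    if x < 0.2:
        return 1.0  # Q is numerically indistinguishable from 1 here
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * x * x)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

def ks_test_fitted(data):
    """KS probabilities for ML-fitted normal and uniform models of 1D data."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    mean, std = data.mean(), data.std()   # ML fit: data mean and variance
    lo, hi = data.min(), data.max()       # ML fit: just include the extremes
    d_normal = ks_statistic(data, lambda x: normal_cdf(x, mean, std))
    d_uniform = ks_statistic(data, lambda x: uniform_cdf(x, lo, hi))
    return {"normal": q_ks(math.sqrt(n) * d_normal),
            "uniform": q_ks(math.sqrt(n) * d_uniform)}
```

Applied to a list of perpendicular or parallel deviations, `ks_test_fitted` returns one probability per hypothesized model; a low value argues against that model.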
3.3 Oriented Stationary Statistics

The covariance matrix $\psi_{ij}$ that appears in the model of matched image features in
Equation 3.2 is allowed to depend on both the image feature and the object feature
involved in the correspondence. Indexing on $i$ allows dependence on the image feature
detection process, while indexing on $j$ allows dependence on the identity of the model
feature. This is useful when some model features are known to be noisier than others.
This flexibility is carried through the formalism of later chapters. Although such
flexibility can be useful, substantial simplification results by assuming that the feature
statistics are stationary in the image, i.e. $\psi_{ij} = \psi$ for all $i, j$. This could be
reasonable if the feature fluctuations were isotropic in the image, for example. In its
strict form this assumption may be too limiting, however. This section outlines a
compromise approach, oriented stationary statistics, that was used in the implementations
described in Chapters 6, 7, and 8.

This method involves attaching a coordinate system to each image feature. The
coordinate system has its origin at the point location of the feature, and is oriented
with respect to the direction of the underlying curve at the feature point. When
(stationary) statistics on feature deviations are measured, they are taken relative to
these coordinate systems.

3.3.1 Estimating the Parameters

The experiments reported in Sections 6.2, 7.1, and Chapter 10 use the normal model
and oriented stationary statistics for matched image features. After this choice of
model, it is still necessary to supply the specific parameters for the model, namely
the covariance matrices, $\psi_{ij}$, of the normal densities.

The parameters were estimated from observations on matches done by hand on
sample images from the domain. Because of the stationarity assumption it is possible
to estimate the common covariance, $\hat{\psi}$, by observing match data on one image. For
this purpose, a match was done with a mouse between features from a Mean Edge
Image (these are described in Section 4.4) and a representative image from the domain.
During this process, the pose of the object was the same in the two images. This
produced a set of corresponding edge features. For the sake of example, the process
will be described for 2D point features (described in Section 5.2). The procedure has
also been used with 2D point-radius features and 2D oriented-range features, which are
described in Sections 5.3 and 5.4 respectively.

Let the observed image features be described by $Y_i$, and the corresponding mean
model features by $\bar{Y}_i$. The observed residuals between the "data" image features and
the "mean" features are $\Delta_i = Y_i - \bar{Y}_i$.

The features are derived from edge data, and the underlying edge curve has an
orientation angle in the image. These angles are used to define coordinate systems
specific to each image feature $Y_i$. These coordinate systems define rotation matrices
$R_i$ that are used to transform the residuals into the coordinate systems of the features,
in the following way: $\Delta'_i = R_i \Delta_i$.

The stationary covariance matrix of the matched feature fluctuations observed
in the feature coordinate systems is then estimated using the Maximum Likelihood
method, as follows,

$$\hat{\psi} = \frac{1}{n} \sum_{i} \Delta'_i \Delta'^{T}_i.$$

Here $T$ denotes the matrix transpose operation. This technique has some bias, but
for the reasonably large sample sizes involved ($n \approx 100$) the effect is minor.

The resulting covariance matrices typically indicate larger variance for deviations
along the edge curve than perpendicular to it, as suggested by the data in Figures
3-5 and 3-6.
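The estimation procedure above can be sketched as follows. The sketch assumes 2D point features, NumPy, and a particular convention that is an assumption of this example: each rotation takes image-frame vectors into a frame whose first axis lies along the edge curve, given the edge orientation angle in radians. The function names are hypothetical.

```python
import numpy as np

def rotation_into_feature_frame(angle):
    """Rotation taking image-frame vectors into the frame aligned with the edge.

    Assumed convention: `angle` is the edge direction in the image, so the
    unit tangent (cos angle, sin angle) maps to the first axis (1, 0).
    """
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, s], [-s, c]])

def estimate_stationary_covariance(image_pts, model_pts, angles):
    """Maximum Likelihood covariance of residuals in the feature frames."""
    residuals = np.asarray(image_pts, float) - np.asarray(model_pts, float)
    rotated = np.array([rotation_into_feature_frame(a) @ d
                        for a, d in zip(angles, residuals)])
    # (1/n) * sum of outer products of the rotated residuals.
    return rotated.T @ rotated / len(rotated)
```

With this convention the first diagonal entry of the result is the variance along the curve and the second is the variance perpendicular to it.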
3.3.2 Specializing the Covariance

At recognition time, it is necessary to specialize the constant covariance to each image
feature. This is done by rotating it to orient it with respect to the image feature.
A covariance matrix transforms like the following product of residuals:

$$\Delta'_i \Delta'^{T}_i.$$

This is transformed back to the image system as follows,

$$R_i^T \Delta'_i \Delta'^{T}_i R_i.$$

Thus the constant covariance $\hat{\psi}$ is specialized to the image features in the
following way,

$$\psi_{ij} = R_i^T \hat{\psi} R_i.$$
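A short sketch of the specialization step follows, under the same assumed conventions as the rest of this section: 2D features, NumPy, and a rotation that takes image-frame vectors into a frame whose first axis lies along the edge. The function names are hypothetical.

```python
import numpy as np

def rotation_into_feature_frame(angle):
    """Rotation taking image-frame vectors into the edge-aligned frame."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, s], [-s, c]])

def specialize_covariance(psi_hat, angle):
    """Rotate the constant feature-frame covariance back into the image frame."""
    r = rotation_into_feature_frame(angle)
    return r.T @ np.asarray(psi_hat, dtype=float) @ r
```

For example, a feature-frame covariance with variance 4 along the curve and 1 across it, attached to a vertical edge, yields an image-frame covariance with variance 4 along the image y axis.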



Chapter 4 



Modeling Objects



What is needed from object models? For recognition, the main issue lies in predicting
the image features that will appear in an image of the object. Should the object model
be a monolithic 3D data structure? After all, the object itself is 3D. In this chapter,
some pros and cons of monolithic 3D models are outlined. An alternative approach,
interpolation of views, is proposed. The related problem of obtaining the object
model data is discussed, and it is proposed that the object model data be obtained
by taking pictures of the object. An automatic method for this purpose is described.
Additionally, a means of edge detection that captures the average edges of an object
is described.



4.1 Monolithic 3D Object Models

One motivation for using 3D object models in recognition systems is the observation
that computer graphics techniques can be used to synthesize convincing images from
3D models in any pose desired.

For some objects, having a single 3D model seems a natural choice for a recognition
system. If the object is polygonal, and is represented by a list of 3D line segments and
vertices, then predicting the features that will appear in a given high resolution view
is a simple matter. All that is needed is to apply a pose dependent transformation to
each feature, and to perform a visibility test.

For other objects, such as smoothly curved objects, the situation is different.
Predicting features becomes more elaborate. In video imagery, occluding edges (or limbs)
are often important features. Calculating the limb of a smooth 3D surface is usually
complicated. Ponce and Kriegman [58] describe an approach for objects modeled
by parametric surface patches. Algebraic elimination theory is used to relate image
limbs to the model surfaces that generated them. Brooks' vision system, Acronym
[10], also recognized curved objects from image limbs. It used generalized cylinders
to model objects. A drawback of this approach is that it is awkward to realistically
model typical objects, like telephones or automobiles, with generalized cylinders.

Predicting reduced resolution image features is another difficulty with monolithic
3D models. This is a drawback because doing recognition with reduced resolution
features is an attractive strategy: with fewer features less search will be needed. One
solution would be to devise a way of smoothing 3D object models such that simple
projection operations would accurately predict reduced resolution edge features. No
such method is known to the author.

Detecting reduced resolution image features is straightforward. Good edge features
of this sort may be obtained by smoothing the grayscale image before using an
edge operator. This method is commonly used with the Canny edge operator [13],
and with the Marr-Hildreth operator [53].

An alternative approach is to do projections of the object model at full resolution,
and then to do some kind of smoothing of the image. It isn't clear what sort of
smoothing would be needed. One possibility is to do photometrically realistic projections
(for example by ray tracing rendering), perform smoothing in the image, and
then use the same feature detection scheme as is used on the images presented for
recognition. This method is likely to be too expensive for practical recognition systems
that need to perform large amounts of prediction. Perhaps better ways of doing this
will be found.

Self occlusion is an additional complexity of the monolithic 3D model approach.
In computer graphics there are several ways of dealing with this issue, among them
hidden line and z-buffer methods. These methods are fairly expensive, at least in
comparison to sparse point projections.

In summary, monolithic 3D object models address some of the requirements for
predicting images for recognition, but the computational cost may be high.



4.2 Interpolation of Views

One approach to avoiding the difficulties discussed in the previous section is to use an
image-based approach to object modeling. Ullman and Basri [71] have discussed such
approaches. There is some biological evidence that animal vision systems have
recognition subsystems that are attuned to specific views of faces [25]. This may provide
some assurance that image-based approaches to recognition aren't unreasonable.

An important issue with image-based object modeling concerns how to predict
image features in a way that covers the space of poses that the object may assume.

Bodies undergoing rigid motion in space have six degrees of freedom, three in
translation and three in rotation. This six parameter pose space may be split into two
parts: the first part being translation and in-image-plane rotations (four parameters),
and the second part being out-of-image-plane rotations (two parameters: the "view
sphere").

Synthesizing views of an object that span the first part of pose space can often
be done using simple and efficient linear methods of translation, rotation, and scale
in the plane. This approach can be precise under orthographic projection with scaling,
and accurate enough in some domains with perspective projection. Perspective
projection is often approximated in recognition systems by 3D rotation combined
with orthographic projection and scaling. This has been called the weak perspective
approximation [70].

The second part of pose space, out of plane rotation, is more complicated. The
approach advocated in this research involves tessellating the view sphere around the
object, and storing a view of the object for each vertex of the tessellation. Arbitrary
views will then entail, at most, small out of plane rotations from stored views. These
views may be synthesized using interpolation. The Linear Combination of Views
method of Ullman and Basri [71] works well for interpolating between nearby views
(and more distant ones, as well).

Conceptually, the interpolation of views method caches pre-computed predictions
of images, saving the expense of repeatedly computing them during recognition. If
the tessellation is dense enough, difficulties owing to large changes in aspect may be
avoided.

Breuel [9] advocates a view based approach to modeling, without interpolation.



4.3 Object Models from Observation

How can object model features be acquired for use in the interpolation of views
framework? If a detailed CAD model of the object is available, then views might be
synthesized using graphical rendering programs (this approach was used in the (single
view) laser radar experiment described in Section 7.3).

Another method is to use the object itself as its own model, and to acquire views
by taking pictures of the object. This process can make use of the feature extraction
method that is used on images at recognition time. An advantage of this scheme is
that an accurate CAD style model isn't needed. Using the run-time feature extraction
mechanism of the recognition system automatically selects the features that will be
salient at recognition time, which is otherwise a potentially difficult problem.

One difficulty with the models from observation approach is that image features
tend to be somewhat unstable. For example, the presence and location of edge features
is influenced by illumination conditions, as illustrated in the following figures. Figure
4-1 shows a series of nine grayscale images where the only variation is in lighting. A
corresponding set of edge images is shown in Figure 4-2. The edge operator used in
preparing the images is described in Section 4.4. The standard deviation of the smoothing
operator was 2 pixels.



4.4 Mean Edge Images

It was pointed out above that the instability of edge features is a potential difficulty
of acquiring object model features from observation. The Mean Edge Image method
solves this problem by making edge maps that are averaged over variations due to
illumination changes.

Brightness edges may be characterized as the ridges of a measure of brightness
variation. This is consistent with the common notion that edges are the 1D loci of
maxima of changes in brightness. The edge operator used in Figure 4-2 is an example
of this style of edge detector. It is a ridge operator applied to the squared discrete
gradient of smoothed images. Here, the squared discrete gradient is the measure of
brightness variation. This style of edge detection was described by Mercer [57]. The
mathematical definition of the ridge predicate is that the gradient is perpendicular to
the direction having the most negative second directional derivative. Another similar
definition of edges was proposed by Haralick [37]. For a general survey of edge detection
methods, see Robot Vision, by Horn [39].

The preceding characterization of image edges generalizes naturally to mean edges.
Mean edges are defined to be ridges in the average measure of brightness fluctuation.
In this work, average brightness fluctuation over a set of pictures is obtained by
averaging the squared discrete gradient of the (smoothed) images.
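The averaging step can be sketched as follows. This is an illustration only: it assumes NumPy, grayscale images supplied as equally sized 2D arrays, a separable Gaussian smoother truncated at three standard deviations, and central differences for the discrete gradient. The ridge operator and thresholding that turn the averaged measure into edges are not shown, and all function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def smooth(image, sigma):
    """Separable Gaussian smoothing with replicated borders."""
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    padded = np.pad(np.asarray(image, dtype=float), pad, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

def squared_gradient(image, sigma):
    """Squared discrete gradient magnitude of the smoothed image."""
    gy, gx = np.gradient(smooth(image, sigma))
    return gx ** 2 + gy ** 2

def mean_squared_gradient(images, sigma=2.0):
    """Average the squared gradient over a set of images of one scene."""
    return np.mean([squared_gradient(im, sigma) for im in images], axis=0)
```

Edges that appear under only one lighting condition contribute weakly to the average, while edges that persist across the set reinforce one another.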

Figure 4-3 shows the averaged squared gradient of smoothed versions of the images
that appear in Figure 4-1. Recall that only the lighting changed between these images.

Figure 4-1: Grayscale Images

Figure 4-2: Edge Images

Figure 4-3: Averaged Squared Gradient of Smoothed Images

Figure 4-4 shows the ridges from the image of Figure 4-3. Hysteresis thresholding
based on the magnitude of the averaged squared gradient has been used to suppress
weak edges. Such hysteresis thresholding is used with the Canny edge operator. Note
that this edge image is relatively immune to specular highlights, in comparison to the
individual edge images of Figure 4-2.



4.5 Automatic 3D Object Model Acquisition

This section outlines a method for automatic 3D object model acquisition that
combines interpolation of views and Mean Edge Images. The method involves automatically
acquiring (many) pictures of the object under various combinations of pose and
illumination. A preliminary implementation of the method was used to acquire object
model features for the 3D recognition experiment discussed in Section 10.4.

Figure 4-4: Ridges of Average Squared Gradient of Smoothed Images

Figure 4-5: A Pentakis Dodecahedron



The object, a plastic car model, was mounted on the tool flange of a PUMA 560
robot. A video camera connected to a Sun Microsystems MC video digitizer was
mounted near the robot.

For the purpose of Interpolation of Views object model construction, the view
sphere around the object was tessellated into 32 viewpoints, the vertices of a pentakis
dodecahedron (one is illustrated in Figure 4-5). At each viewpoint a "canonical pose"
for the object was constructed that oriented the viewpoint towards the camera, while
keeping the center of the object in a fixed position.
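A 32-viewpoint tessellation of this kind can be generated as sketched below. The construction is an assumption of this example rather than a description of the original setup: it takes the 20 vertices of a regular dodecahedron together with the 12 directions through its face centers (the vertices of the dual icosahedron) and projects all of them onto the unit sphere. NumPy is assumed and the function names are hypothetical.

```python
import itertools
import numpy as np

def pentakis_dodecahedron_viewpoints():
    """Return 32 unit vectors: 20 dodecahedron and 12 icosahedron directions."""
    phi = (1.0 + np.sqrt(5.0)) / 2.0  # golden ratio
    # Eight cube corners belong to the dodecahedron.
    points = [np.array(p, dtype=float)
              for p in itertools.product((-1, 1), repeat=3)]
    for s1, s2 in itertools.product((-1, 1), repeat=2):
        dodeca = np.array([0.0, s1 * phi, s2 / phi])
        icosa = np.array([0.0, s1 * 1.0, s2 * phi])
        for shift in range(3):  # cyclic permutations of the coordinates
            points.append(np.roll(dodeca, shift))
            points.append(np.roll(icosa, shift))
    points = np.array(points)
    return points / np.linalg.norm(points, axis=1, keepdims=True)

def nearest_neighbor_angle_deg(points):
    """Smallest angle, in degrees, between any two distinct viewpoints."""
    cosines = np.clip(points @ points.T, -1.0, 1.0)
    np.fill_diagonal(cosines, -1.0)
    return float(np.degrees(np.arccos(cosines.max())))
```

For this construction the closest pairs of viewpoints are a little over 37 degrees apart, which gives a sense of how coarse a 32-view tessellation is.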

Nine different configurations of lighting were arranged for the purpose of
constructing Mean Edge Images. The lighting configurations were made by moving a
spotlight to nine different positions that illuminated the object. The lamp positions
roughly covered the view hemisphere centered on the camera.

The object was moved to the canonical poses corresponding to the 21 vertices in
the upper part (roughly 2/3) of the object's view sphere. At each of these poses,
pictures were taken with each of the nine lamp positions.

Mean Edge Images at various scales of smoothing were constructed for each of
the canonical poses. Object model features for recognition experiments described in
Chapter 8 were derived from these Mean Edge Images. Twenty of the images from
one such set of Mean Edge Images are displayed in Figures 4-6 and 4-7.

Two of these Mean Edge Images were used in an experiment in 3D recognition
using a two-view Linear Combination of Views method. This method requires
correspondences among features at differing views. These correspondences were established
by hand, using a mouse.

It is likely that such feature correspondence could be derived from the results
of a motion program. Shashua's motion program [65], which combines geometry
and optical flow, was tested on images from the experimental setup and was able
to establish good correspondences at the pixel level, for views separated by 4.75
degrees. This range could be increased by a sequential bootstrapping process. If
correspondences can be automatically determined, then the entire process of building
view based models for 3D objects can be made fully automatic.

After performing the experiments reported in Chapter 10, it became apparent that
the views were separated by too large an angle (about 38 degrees) for establishing
a good amount of feature correspondence between some views. This problem may be
relieved by using more views. Using more views also makes automatic determination
of correspondences easier. If the process of model construction is fully automatic,
having a relatively large number of views is potentially workable.

The work of Taylor and Reeves [69] provides some evidence for the feasibility of
multiple-view based recognition. They describe a classification based vision system
that uses a library of views from a 252 vertex icosahedron based tessellation of the
view sphere. Their views were separated by 6.0 to 8.7 degrees. They report good
classification of aircraft silhouettes using this approach.

Figure 4-6: Mean Edge Images at Canonical Poses

Figure 4-7: Mean Edge Images at Canonical Poses

Chapter 5



Modeling Projection



This chapter is concerned with the representations of image and object features, and
with the projection of object features into the image, given the pose of the object.
Four different formulations are described, three of which are used in experiments
reported in other chapters.

The first three models described in this chapter are essentially 2D: the
transformations comprise translation, rotation, and scaling in the plane. Such methods
may be used for single views of 3D objects via the weak perspective approximation,
as described in [70]. In this scheme, perspective projection is approximated by
orthographic projection with scaling. Within this approximation, these methods can
handle four of the six parameters of rigid body motion - everything but out of plane
rotations.

The method described in Section 5.5 is based on Linear Combination of Views,
a view based 3D method that was developed by Ullman and Basri [71].



5.1 Linear Projection Models

Pose determination is often a component of model-based object recognition systems,
including the systems described in this thesis. Pose determination is frequently framed
as an optimization problem. The pose determination problem may be significantly
simplified if the feature projection model is linear in the pose vector. The systems
described in this thesis use projection models having this property, which enables solving
the embedded optimization problem using least squares. Least squares is advantageous
because unique solutions may be obtained easily in closed form. This is a
significant advantage, since the embedded optimization problem is solved many times
during the course of a search for an object in a scene.

All of the formulations of projection described below are linear in the parameters
of the transformation. Because of this they may be written in the following form:

$$\eta_i = \mathcal{P}(M_i, \beta) = M_i \beta. \tag{5.1}$$

The pose of the object is represented by $\beta$, a column vector, and the object model
feature by $M_i$, a matrix. $\eta_i$, the projection of the model feature into the image by
pose $\beta$, is a column vector.

Although this particular form may seem odd, it is a natural one if the focus is on
solving for the pose and the object model features are constants.



5.2 2D Point Feature Model

The first, and simplest, method to be described was used by Faugeras and Ayache in
their vision system HYPER [1]. It is defined as follows: $\eta_i = M_i \beta$, where

$$\eta_i = \begin{bmatrix} p'_{ix} \\ p'_{iy} \end{bmatrix}, \qquad
M_i = \begin{bmatrix} p_{ix} & -p_{iy} & 1 & 0 \\ p_{iy} & p_{ix} & 0 & 1 \end{bmatrix}, \qquad
\beta = \begin{bmatrix} \mu \\ \nu \\ t_x \\ t_y \end{bmatrix}.$$

The coordinates of object model point $i$ are $p_{ix}$ and $p_{iy}$. The coordinates of the
model point $i$, projected into the image by pose $\beta$, are $p'_{ix}$ and $p'_{iy}$. This
transformation is equivalent to rotation by $\theta$, scaling by $s$, and translation by $T$, where

$$T = \begin{bmatrix} t_x \\ t_y \end{bmatrix}, \qquad
s = \sqrt{\mu^2 + \nu^2}, \qquad
\theta = \arctan\left(\frac{\nu}{\mu}\right).$$

This representation has an unsymmetrical way of representing the two classes
of features, which seems odd due to their essential equivalence; however, the trick
facilitates the linear formulation of projection given in Equation 5.1.

In this model, rotation and scale are effected by analogy to the multiplication of
complex numbers, which induces transformations of rotation and scale in the complex
plane. This analogy may be made complete by noting that the algebra of complex
numbers $a + ib$ is isomorphic with that of matrices of the form

$$\begin{bmatrix} a & b \\ -b & a \end{bmatrix}.$$
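Because the projection is linear in the pose, pose estimation from matched points reduces to ordinary least squares. The sketch below illustrates this for the 2D point feature model. It assumes NumPy and a pose vector ordered as two rotation-scale components followed by the two translation components, with each model point contributing the rows `[px, -py, 1, 0]` and `[py, px, 0, 1]`; that layout and all function names are assumptions of this example.

```python
import numpy as np

def point_feature_matrix(p):
    """Feature matrix for one model point (assumed layout, see lead-in)."""
    px, py = p
    return np.array([[px, -py, 1.0, 0.0],
                     [py,  px, 0.0, 1.0]])

def project(model_pts, beta):
    """Project every model point into the image with pose vector beta."""
    return np.array([point_feature_matrix(p) @ beta for p in model_pts])

def solve_pose(model_pts, image_pts):
    """Least squares pose from matched model and image points."""
    a = np.vstack([point_feature_matrix(p) for p in model_pts])
    b = np.asarray(image_pts, dtype=float).reshape(-1)
    beta, *_ = np.linalg.lstsq(a, b, rcond=None)
    return beta

def pose_to_similarity(beta):
    """Recover scale, rotation angle (radians), and translation from beta."""
    mu, nu, tx, ty = beta
    return np.hypot(mu, nu), np.arctan2(nu, mu), np.array([tx, ty])
```

Two matched points already determine the four pose parameters; with more matches the least squares solution averages over the measurement noise.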



5.3 2D Point-Radius Feature Model

This section describes an extension of the previous feature model that incorporates
information about the normal and curvature at a point on a curve (in addition to the
coordinate information).

There are advantages in using richer features in recognition - they provide more
constraints, and can lead to space and time efficiencies. These potential advantages
must be weighed against the practicality of detecting the richer features. For example,
there is incentive to construct features incorporating higher derivative information at
a point on a curve; however, measuring higher derivatives of curves derived from video
imagery is probably impractical, because each derivative magnifies the noise present
in the data.

Figure 5-1: Edge Curve, Osculating Circle, and Radius Vector

The feature described here is a compromise between richness and detectability. It
is defined as follows: $\eta_i = M_i \beta$, where

$$\eta_i = \begin{bmatrix} p'_{ix} \\ p'_{iy} \\ c'_{ix} \\ c'_{iy} \end{bmatrix}, \qquad
M_i = \begin{bmatrix} p_{ix} & -p_{iy} & 1 & 0 \\ p_{iy} & p_{ix} & 0 & 1 \\ c_{ix} & -c_{iy} & 0 & 0 \\ c_{iy} & c_{ix} & 0 & 0 \end{bmatrix}, \qquad
\beta = \begin{bmatrix} \mu \\ \nu \\ t_x \\ t_y \end{bmatrix}.$$

The point coordinates and $\beta$ are as above. $c_{ix}$ and $c_{iy}$ represent the radius vector
of the curve's osculating circle that touches the point on the curve, as illustrated
in Figure 5-1. This vector is normal to the curve. Its length is the inverse of the
curvature at the point. The counterparts in the image are given by $c'_{ix}$ and $c'_{iy}$. With
this model, the radius vector $c$ rotates and scales as do the coordinates $p$, but it does
not translate. Thus, the aggregate feature translates, rotates and scales correctly.

This feature model is used in the experiments described in Sections 6.2, 7.4, and
10.1. When the underlying curvature goes to zero, the length of the radius vector
diverges, and the direction becomes unstable. This has been accommodated in the
experiments by truncating $c$. Although this violates the "transforms correctly"
criterion, the model still works well.

5.4 2D Oriented-Range Feature Model

This feature projection model is very similar to the one described previously. It was
designed for use in range imagery instead of video imagery. Like the previous feature,
it is fitted to fragments of image edge curves. In this case, the edges label
discontinuities in range. It is defined just as above in Section 5.3, but the interpretation
of $c$ is different. The point coordinates and $\beta$ are as above. As above, $c_{ix}$ and $c_{iy}$
are a vector whose direction is perpendicular to the (range discontinuity) curve
fragment. The difference is that rather than encoding the inverse of the curvature, the
length of the vector encodes instead the inverse of the range at the discontinuity. The
counterparts in the image are given by $c'_{ix}$ and $c'_{iy}$. The aggregate feature translates,
rotates and scales correctly when used with imaging models where the object features
scale according to the inverse of the distance to the object. This holds under
perspective projection with attached range labels when the object is small compared to
the distance to the object.

This model was used in the experiments described in Section 7.3.

5.5 linear (Srinrationof Vew 

The technique used in the above rathods for synthesizing rotation and scale amunts 
to raking linear conH nations of the object mdel wth a copy of it that has been 
rotated 90 degrees i n the pi are. 

In their paper, "ftccgriti on by linear (oria nation of Mdels" [71], III ran and 
Rsri describe a sclera f or synthesizing view under 3Dorthography wth rotation 
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and scale that has a linear paranaterization. They showthat the space of i rages of 
an obj ect i s a subspace of a 1 i near space that i s spanned by the corponents of a few 
i rages of an obj ect. They di sens s variants of their form! ati on that are based on tw> 
view, and on three and rare view, recovering conventional pose paranaters from 
the 1 i near coria nati on coeffii ents i s descri bed i n [ 60] . 

The following is a brief explanation of the two-view method. The reader is referred to [71] for a fuller description. Point projection from 3D to 2D under orthography, rotation, and scale is a linear transformation. If two (2D) views are available, along with the transformations that produced them (as in stereo vision), then there is enough data to invert the transformations and solve for the 3D coordinates (three equations are needed, four are available). The resulting expression for the 3D coordinates will be a linear equation in the components of the two views. New 2D views may then be synthesized from the 3D coordinates by yet another linear transformation. Compounding these linear operations yields an expression for new 2D views that is linear in the components of the original two views. There is a quadratic constraint on the 3D to 2D transformations, due to the constraints on rotation matrices. The usual Linear Combination of Views approach makes use of the above linearity property while synthesizing new views with general linear transformations (without the constraints). This practice leads to two extra parameters that control stretching transformations in the synthesized image. It also reduces the need to deal with camera calibrations - the pixel aspect ratio may be accommodated in the stretching transformations.
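The linearity property described above can be checked numerically. The following is a minimal sketch, not the implementation used in this work: it assumes scaled orthographic projection, and the helper names (`rotation`, `orthographic_view`, `lcv_basis`) and the particular angles are hypothetical choices made for illustration. It fits the coordinates of a new view as an unconstrained linear combination of the x and y coordinates of one stored view, the x coordinate of a second stored view, and a constant term for translation, and confirms that the fit is exact up to rounding.

```python
import numpy as np

def rotation(ax, ay, az):
    """Rotation matrix built from rotations about the x, y and z axes (radians)."""
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return rz @ ry @ rx

def orthographic_view(points, rot, scale=1.0, shift=(0.0, 0.0)):
    """Rotate the 3D points, drop z, then scale and translate. Returns an (n, 2) array."""
    return scale * (points @ rot.T)[:, :2] + np.asarray(shift)

def lcv_basis(view_p, view_q):
    """Columns: p_x, p_y, q_x and a constant 1 that carries the translation."""
    return np.column_stack([view_p[:, 0], view_p[:, 1], view_q[:, 0],
                            np.ones(len(view_p))])

rng = np.random.default_rng(0)
points = rng.normal(size=(20, 3))                       # hypothetical 3D object points
view_p = orthographic_view(points, rotation(0.0, 0.0, 0.0))
view_q = orthographic_view(points, rotation(0.1, 0.4, 0.0))
new_view = orthographic_view(points, rotation(0.3, -0.2, 0.5),
                             scale=1.3, shift=(2.0, -1.0))

# Unconstrained fit: four coefficients per image coordinate, eight in total.
basis = lcv_basis(view_p, view_q)
coeffs, *_ = np.linalg.lstsq(basis, new_view, rcond=None)
lcv_residual = np.abs(basis @ coeffs - new_view).max()
print(coeffs.shape, lcv_residual)
```

The fit uses general linear coefficients and ignores the quadratic rotation constraint, which mirrors the unconstrained practice described above.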

The following projection model uses a two-view variant of the Linear Combination of Views method to synthesize views with limited 3D rotation and scale. Additionally, translation has been added in a straightforward way. $\eta_i = M_i \beta$, where

$$ M_i = \begin{bmatrix} p_{ix} & 0 & p_{iy} & 0 & q_{ix} & 0 & 1 & 0 \\ 0 & p_{ix} & 0 & p_{iy} & 0 & q_{ix} & 0 & 1 \end{bmatrix} $$

and

$$ \beta = [\beta_0 \;\; \beta_1 \;\; \beta_2 \;\; \beta_3 \;\; \beta_4 \;\; \beta_5 \;\; \beta_6 \;\; \beta_7]^T . $$

The coordinates of the i'th point in one view are $p_{ix}$ and $p_{iy}$; in the other view they are $q_{ix}$ and $q_{iy}$.

When this projection model is used, $\beta$ does not in general describe a rigid transformation, but it is nevertheless called the pose vector for notational consistency.

This method is used in the experiment described in Section 10.4.

Chapter 6

MAP Model Matching



MAP Model Matching¹ (MMM) is the first of two statistical formulations of object recognition to be discussed in this thesis. It builds on the models of features and correspondences, objects, and projection that are described in the previous chapters. MMM evaluates joint hypotheses of match and pose in terms of their posterior probability, given an image. MMM is the starting point for the second formulation of object recognition, Posterior Marginal Pose Estimation (PMPE), which is described in Chapter 7.

The MMM objective function is amenable to search in correspondence space, the space of all possible assignments from image features to model and background features. This style of search has been used in many recognition systems, and it is used here in a recognition experiment involving low resolution edge features.

It is shown that under certain conditions, searching in pose space for maxima of the MMM objective function is equivalent to robust methods of chamfer matching [47].



¹ Early versions of this work appeared in [74] and [75].

6.1 Objective Function for Pose and Correspondences

In this section an objective function for evaluating joint hypotheses of match and pose using the MAP criterion will be derived.

Briefly, probability densities of image features, conditioned on the parameters of match and pose ("the parameters"), are combined with prior probabilities on the parameters using Bayes' rule. The result is a posterior probability density on the parameters, given an observed image. An estimate of the parameters is then formulated by choosing them so as to maximize their a-posteriori probability. (Hence the term MAP. See Beck and Arnold's textbook [4] for a discussion of MAP estimation.) MAP estimators are especially practical when used with normal probability densities.

This research focuses on feature based recognition. The probabilistic models of image features described in Chapter 3 are used. Initially, image features are assumed to be mutually independent (this is relaxed in Section 6.1.1). Additionally, matched image features are assumed to be normally distributed about their predicted positions in the image, and unmatched (background) features are assumed to be uniformly distributed in the image. These densities are combined with a prior model of the parameters. When a linear projection model is used, a simple objective function for match and pose results.

As described in Chapter 2, the image that is to be analyzed is represented by a set of v-dimensional column vectors,

$$ Y = \{Y_1, Y_2, \ldots, Y_n\} , \quad Y_i \in R^v . $$

The object model is denoted by $M$,

$$ M = \{M_1, M_2, \ldots, M_m\} . $$

When linear projection models are used, as discussed in Chapter 5, the object features will be represented by real matrices: $M_j \in R^{v \times z}$ (z is defined below).

The parameters to be estimated in matching are the correspondences between image and object features, and the pose of the object in the image. As discussed in Section 2.1, the state of match, or correspondences, is described by the variable $\Gamma$,

$$ \Gamma = \{\Gamma_1, \Gamma_2, \ldots, \Gamma_n\} , \quad \Gamma_i \in M \cup \{\perp\} . $$

Here $\Gamma_i = M_j$ means that image feature i corresponds to object model feature j, and $\Gamma_i = \perp$ means that image feature i is due to the background.

The pose of the object is a real vector: $\beta \in R^z$. A projection function, $\mathcal{P}()$, maps object model features into the v-dimensional image coordinate space according to the pose, $\mathcal{P}(M_i, \beta) \in R^v$.

The probabilistic models of image features described in Chapter 3 may be written as follows:

$$ p(Y_i \mid \Gamma, \beta) = \begin{cases} \dfrac{1}{W_1 W_2 \cdots W_v} & \text{if } \Gamma_i = \perp \\[2ex] G_{\psi_{ij}}(Y_i - \mathcal{P}(M_j, \beta)) & \text{if } \Gamma_i = M_j \end{cases} \tag{6.1} $$

where

$$ G_{\psi}(x) = (2\pi)^{-\frac{v}{2}} \, |\psi|^{-\frac{1}{2}} \exp\left(-\frac{1}{2} x^T \psi^{-1} x\right) . $$

Here $\psi_{ij}$ is the covariance matrix associated with image feature i and object model feature j. Thus image features arising from the background are uniformly distributed over the image feature coordinate space (the extent of the image feature coordinate space along dimension i is given by $W_i$), and matched image features are normally distributed about their predicted locations in the image. In some applications $\psi$ could be independent of i and j - an assumption that the feature statistics are stationary in the image, or $\psi$ may depend only on i, the image feature index. The latter is the case when the oriented stationary statistics model is used (see Section 3.3).

Assuming independent features, the joint probability density on image feature coordinates may be written as follows:

$$ p(Y \mid \Gamma, \beta) = \prod_i p(Y_i \mid \Gamma, \beta) = \prod_{i : \Gamma_i = \perp} \frac{1}{W_1 W_2 \cdots W_v} \prod_{ij : \Gamma_i = M_j} G_{\psi_{ij}}(Y_i - \mathcal{P}(M_j, \beta)) . \tag{6.2} $$

This assumption often holds when sensor noise dominates in feature fluctuations.

The next step in the derivation is the construction of a joint prior on correspondences and pose. In Chapter 2, probabilistic models of feature correspondences were discussed. The independent correspondence model is used here for simplicity. Use of the Markov correspondence model is discussed in the following section. The probability that image feature i belongs to the background is $B_i$, while the remaining probability is uniformly distributed for correspondences to the m object model features. In some situations, $B_i$ may be a constant, independent of i. Recalling Equations 2.1 and 2.6,

$$ p(\Gamma) = \prod_i p(\Gamma_i) \quad \text{and} \quad p(\Gamma_i) = \begin{cases} B_i & \text{if } \Gamma_i = \perp \\[1ex] \dfrac{1 - B_i}{m} & \text{otherwise} . \end{cases} $$

Prior information on the pose is assumed to be supplied as a normal density,

$$ p(\beta) = G_{\psi_\beta}(\beta - \beta_0) \tag{6.3} $$

where

$$ G_{\psi_\beta}(x) = (2\pi)^{-\frac{z}{2}} \, |\psi_\beta|^{-\frac{1}{2}} \exp\left(-\frac{1}{2} x^T \psi_\beta^{-1} x\right) . $$

Here $\psi_\beta$ is the covariance matrix of the pose prior and z is the dimensionality of the pose vector, $\beta$. With the combination of normal pose priors and linear projection models the system is closed in the sense that the resulting pose estimate will also be normal. This is convenient for coarse-fine, as discussed in Section 6.4. If little is known about the pose a-priori, the prior may be made quite broad. This is expected to be often the case. If nothing is known about the pose beforehand, the pose prior may be left out. In that case the resulting criterion for evaluating hypotheses will be based on Maximum Likelihood for pose, and on MAP for correspondences.

Assuming independence of the correspondences and the pose (before the image is compared to the object model), a mixed joint probability function may be written as follows,

$$ p(\Gamma, \beta) = G_{\psi_\beta}(\beta - \beta_0) \prod_{i : \Gamma_i = \perp} B_i \prod_{i : \Gamma_i \neq \perp} \frac{1 - B_i}{m} . $$

This is a good assumption when view-based approaches to object modeling are used (these are discussed in Chapter 4 and used in the experiments described in Chapter 10). (With general 3D rotation it is inaccurate, as the visibility of features depends on the orientation of the object.) This probability function on match and pose is now used with Bayes' rule as a prior for obtaining the posterior probability of $\Gamma$ and $\beta$:

$$ p(\Gamma, \beta \mid Y) = \frac{p(Y \mid \Gamma, \beta) \, p(\Gamma, \beta)}{p(Y)} , $$

where $p(Y) = \sum_\Gamma \int d\beta \; p(Y \mid \Gamma, \beta) \, p(\Gamma, \beta)$ is a normalization factor that is formally the probability of the image. It is a constant with respect to $\Gamma$ and $\beta$, the parameters being estimated.

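The MAP strategy is used to obtain estimates of the correspondences and pose by maximizing their posterior probability with respect to $\Gamma$ and $\beta$, as follows:

$$ \hat{\Gamma}, \hat{\beta} = \arg\max_{\Gamma, \beta} \; p(\Gamma, \beta \mid Y) . $$

For convenience, an objective function, L, is introduced that is a scaled logarithm of $p(\Gamma, \beta \mid Y)$. The same estimates will result if the maximization is instead carried out over L:

$$ \hat{\Gamma}, \hat{\beta} = \arg\max_{\Gamma, \beta} \; L(\Gamma, \beta) $$

where

$$ L(\Gamma, \beta) \equiv \ln\left(\frac{p(\Gamma, \beta \mid Y)}{C}\right) . \tag{6.5} $$

The denominator in Equation 6.5 is a constant that has been chosen to cancel constants from the numerator. Its value, which is independent of $\Gamma$ and $\beta$, is

$$ C = \frac{B_1 B_2 \cdots B_n}{(W_1 W_2 \cdots W_v)^n} \; (2\pi)^{-\frac{z}{2}} \, |\psi_\beta|^{-\frac{1}{2}} \; \frac{1}{p(Y)} . $$

After some manipulation the objective function may be expressed as

$$ L(\Gamma, \beta) = -\frac{1}{2} (\beta - \beta_0)^T \psi_\beta^{-1} (\beta - \beta_0) + \sum_{ij : \Gamma_i = M_j} \left[ \lambda_{ij} - \frac{1}{2} (Y_i - \mathcal{P}(M_j, \beta))^T \psi_{ij}^{-1} (Y_i - \mathcal{P}(M_j, \beta)) \right] \tag{6.6} $$

where

$$ \lambda_{ij} \equiv \ln\left( \frac{1}{(2\pi)^{\frac{v}{2}}} \; \frac{1 - B_i}{m B_i} \; \frac{W_1 W_2 \cdots W_v}{|\psi_{ij}|^{\frac{1}{2}}} \right) . \tag{6.7} $$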

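When a linear projection model is used, $\mathcal{P}(M_j, \beta) = M_j \beta$. (Linear projection models were discussed in Chapter 5.) In this case, the objective function takes the following simple form:

$$ L(\Gamma, \beta) = -\frac{1}{2} (\beta - \beta_0)^T \psi_\beta^{-1} (\beta - \beta_0) + \sum_{ij : \Gamma_i = M_j} \left[ \lambda_{ij} - \frac{1}{2} (Y_i - M_j \beta)^T \psi_{ij}^{-1} (Y_i - M_j \beta) \right] . \tag{6.8} $$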

When the background probability is constant, and when the feature covariance matrix determinant is constant (as when oriented stationary statistics are used), the formulas simplify further -

$$ \lambda = \ln\left( \frac{1}{(2\pi)^{\frac{v}{2}}} \; \frac{1 - B}{m B} \; \frac{W_1 W_2 \cdots W_v}{|\hat{\psi}|^{\frac{1}{2}}} \right) \tag{6.9} $$

and

$$ L(\Gamma, \beta) = -\frac{1}{2} (\beta - \beta_0)^T \psi_\beta^{-1} (\beta - \beta_0) + \sum_{ij : \Gamma_i = M_j} \left[ \lambda - \frac{1}{2} (Y_i - M_j \beta)^T \psi_i^{-1} (Y_i - M_j \beta) \right] . \tag{6.10} $$

Here, $\hat{\psi}$ is the stationary feature covariance matrix, and $\psi_i$ is the specialized feature covariance matrix. These were discussed in Section 3.3.

The first term of the objective function of Equation 6.8 expresses the influence of the prior on the pose. As discussed above, when a useful pose prior isn't available, this term may be dropped.

The second term has a simple interpretation. It is a sum taken over those image features that are matched to object model features. The $\lambda_{ij}$ are fixed rewards for making correspondences, while the quadratic forms are penalties for deviations of observed image features from their expected positions in the image. Thus the objective function evaluates the amount of the image explained in terms of the object, with penalties for mismatch. This objective function is particularly simple in terms of $\beta$. When $\Gamma$ is constant, $\beta$ and its (posterior) covariance are estimated by weighted least squares. When using an algorithm based on search in correspondence space, the estimate of $\beta$ can be cheaply updated by using the techniques of sequential parameter estimation. (See Beck and Arnold [4].) The $\lambda_{ij}$ describe the relative value of a match component or extension in a way that allows direct comparison to the entailed mismatch penalty. The values of these trade-off parameter(s) are supplied by the theory (in Equation 6.7) and are given in terms of measurable domain statistics.
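The weighted least squares step mentioned above can be sketched as follows. This is an illustrative batch version under stated assumptions, not the sequential estimator used in this work: setting the gradient of Equation 6.8 with respect to the pose to zero gives a linear system whose inverse matrix is the posterior covariance. The function names, the 2D similarity feature matrix in `similarity_feature`, and the numeric values of the rewards, noise covariance and prior are all hypothetical.

```python
import numpy as np

def map_pose(matches, prior_mean, prior_cov):
    """Pose estimate and posterior covariance for a fixed set of correspondences.

    matches: list of (Y_i, M_j, psi_ij) with shapes (v,), (v, z), (v, v).
    """
    info = np.linalg.inv(prior_cov)        # information contributed by the pose prior
    rhs = info @ prior_mean
    for y, m, psi in matches:
        w = np.linalg.inv(psi)             # weight of this matched feature
        info = info + m.T @ w @ m
        rhs = rhs + m.T @ w @ y
    cov = np.linalg.inv(info)
    return cov @ rhs, cov

def objective(matches, rewards, beta, prior_mean, prior_cov):
    """Value of the linear-projection objective (form of Equation 6.8)."""
    d = beta - prior_mean
    total = -0.5 * d @ np.linalg.solve(prior_cov, d)
    for (y, m, psi), lam in zip(matches, rewards):
        r = y - m @ beta
        total += lam - 0.5 * r @ np.linalg.solve(psi, r)
    return total

def similarity_feature(px, py):
    """Assumed 2D model: M @ [s*cos, s*sin, tx, ty] is the projected point."""
    return np.array([[px, -py, 1.0, 0.0],
                     [py,  px, 0.0, 1.0]])

rng = np.random.default_rng(1)
beta_true = np.array([0.9, 0.3, 5.0, -2.0])
psi = 0.01 * np.eye(2)                     # hypothetical feature covariance
matches = []
for px, py in rng.uniform(-10, 10, size=(5, 2)):
    m = similarity_feature(px, py)
    matches.append((m @ beta_true, m, psi))    # noise-free observations
rewards = [1.0] * len(matches)             # hypothetical correspondence rewards
prior_mean, prior_cov = np.zeros(4), 1e6 * np.eye(4)   # very broad pose prior

beta_hat, beta_cov = map_pose(matches, prior_mean, prior_cov)
print(beta_hat)
```

A sequential variant would update `info` and `rhs` by one matched feature at a time, which is what makes extending a partial match inexpensive.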

The form of the objective function suggests an optimization strategy: make correspondences to object features in order to accumulate correspondence rewards while avoiding penalties for mismatch. It is important that the $\lambda_{ij}$ be positive, otherwise a winning strategy is to make no matches to the object at all. This condition defines a critical level of image clutter, beyond which the MAP criterion assigns the feature to the background. $\lambda_{ij}$ describes the dependence of the value of matches on the amount of background clutter. If background features are scarce, then correspondences to object features become more important.

This objective function provides a simple and uniform way to evaluate match and pose hypotheses. It captures important aspects of recognition: the amount of image explained in terms of the object, as well as the metrical consistency of the hypothesis; and it trades them off in a rational way based on domain statistics. Most previous approaches have not made use of both criteria simultaneously in evaluating hypotheses, thereby losing some robustness.



6.1.1 Using the Markov Correspondence Model

When the Markov correspondence model of Section 2.3 is used instead of the independent correspondence model, the functional form of the objective function of Equation 6.6 remains essentially unchanged, aside from gaining a new term that captures the influence of the interaction of neighboring features. The names of some of the constants change, reflecting the difference between Equations 2.2 and 2.4. Noting that $p(\Gamma, \beta \mid Y)$ is linear in $p(\Gamma)$, it can be seen that the new term in the logarithmic objective function will be:

$$ \sum_{i=1}^{n-1} \ln r_i(\Gamma_i, \Gamma_{i+1}) . $$

As before, when an algorithm based on search in correspondence space is used, the estimate of $\beta$ can still be cheaply updated. A change in an element of correspondence, some $\Gamma_i$, will now additionally entail the update of two of the terms in the expression above.



6.2 Experimental Implementation

In this section an experiment demonstrating the use of the MMM objective function is described. The intent is to demonstrate the utility of the objective function in a domain of features that have significant fluctuations. The features are derived from real images. The domain is matching among features from low resolution edge images. The point-radius feature model discussed in Section 5.3 is used. Oriented stationary statistics, as described in Section 3.3, are used to model the feature fluctuations, so that $\lambda_{ij} = \lambda$.

6.2.1 Search in Correspondence Space

Good solutions of the objective function of Equation 6.8 are sought by a search in correspondence space. Search over the whole exponential space is avoided by heuristic pruning.

An objective function that evaluates a configuration of correspondences, or match (described by $\Gamma$), may be obtained as follows:

$$ L(\Gamma) \equiv \max_\beta \; L(\Gamma, \beta) . $$

This optimization is quadratic in $\beta$ and is carried out by least squares. Sequential techniques are used so that the cost of extending a partial match by one correspondence is O(1).

The space of correspondences may be organized as a directed acyclic graph (DAG) by the following parent-child relation on matches. A point in correspondence space, or match, is a child of another match if there is some i such that $\Gamma_i = \perp$ in the parent, and $\Gamma_i = M_j$, for some j, in the child, and they are otherwise the same. Thus, the child has one more assignment to the model than the parent does. This DAG is rooted in the match where all assignments are to the background. All possible matches are reachable from the root. A fragment of an example DAG of this kind is illustrated in Figure 6-1. Components of matches that are not explicit in the figure are assigned to the background.

Heuristic beam search, as described in [64], is used to search over matches for good solutions of L. Success depends on the heuristic that there aren't many impostors in the image. An impostor is a set of image features that scores well but isn't a subset of the optimum match implied by the objective function. Another way of stating the heuristic is that the best match to n+1 object features is likely to contain the best match to n object features.

The search method used in the experiments employs a bootstrapping mechanism based on distinguished features. Object features 1, 2 and 3 are special, and must be detected. The scheme could be made robust by considering more initial triples of object features. Alternatively, indexing methods could be used as an efficient and robust means to initiate the search. Indexing methods are described by Clemens and Jacobs [19], and in Section 9.1.

Figure 6-1: Fragment of Correspondence Space DAG

The algorithm that was used is outlined below.

BEAM-SEARCH(M, Y)
    CURRENT <- {$\Gamma$ : exactly one image feature is matched to each of $M_1$, $M_2$ and $M_3$}
    ;; the rest are assigned to the background.
    Prune CURRENT according to L. Keep 50 best.
    Iterate to fixpoint:
        Add to CURRENT all children of members of CURRENT
        Prune CURRENT according to L. Keep N best.
        ;; N is reduced from 20 to 5 as the search proceeds.
    Return(CURRENT)

Figure 6-2: Images used for Matching



Sometimes an extension of a match will produce one that is already in CURRENT, that was reached in a different sequence of extensions. When this happens, the matches are coalesced. This condition is efficiently detected by testing for near equality of the scores of the items in CURRENT. Because the features are derived from observations containing some random noise, it is very unlikely that two hypotheses having differing matches will achieve the same score, since the score is partly based on summed squared errors.
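The beam search over the correspondence DAG can be sketched in a few lines. The following is a simplified illustration under several assumptions, not the implementation used in the experiments: it starts from the all-background root rather than from distinguished features, keeps a fixed beam width, coalesces duplicate matches by storing them in a set rather than by comparing scores, and scores matches with a hypothetical one-dimensional, translation-only objective with no pose prior. All names and numeric values are invented for the example.

```python
def children(match, n_model):
    """All matches with exactly one more assignment to the model than `match`."""
    out = []
    for i, g in enumerate(match):
        if g is None:                      # None stands for the background
            for j in range(n_model):
                out.append(match[:i] + (j,) + match[i + 1:])
    return out

def beam_search(n_image, n_model, score, beam_width=20, max_iters=100):
    """Iterate 'add children, prune to the best beam_width' until nothing changes."""
    def order(m):
        return (-score(m), str(m))         # deterministic tie-breaking
    current = {(None,) * n_image}          # root: everything is background
    for _ in range(max_iters):
        candidates = set(current)          # a set coalesces duplicate matches
        for m in current:
            candidates.update(children(m, n_model))
        pruned = set(sorted(candidates, key=order)[:beam_width])
        if pruned == current:              # fixpoint reached
            break
        current = pruned
    return sorted(current, key=order)

def translation_score(match, image, model, lam=2.0, sigma=1.0):
    """Reward lam per correspondence minus squared deviation from the best shift."""
    resid = [image[i] - model[j] for i, j in enumerate(match) if j is not None]
    if not resid:
        return 0.0
    mean = sum(resid) / len(resid)         # least-squares translation estimate
    return sum(lam - 0.5 * ((r - mean) / sigma) ** 2 for r in resid)

model = [0.0, 10.0, 25.0]                  # hypothetical 1D model features
image = [3.0, 13.1, 27.9, 50.0, 18.0]      # shifted model plus two clutter features

def score(m):
    return translation_score(m, image, model)

result = beam_search(len(image), len(model), score)
print(result[0], score(result[0]))
```

With a narrow beam and no bootstrapping, ties among single-correspondence matches can push the correct partial matches out of the beam, which is one reason a search of this kind benefits from distinguished features or indexing to start it.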



6.2.2 Example Search Results

The search method described in the previous section was used to obtain good matches in a domain of features that have significant fluctuations. The features were derived from real images. A linear projection model was used.

Images used for matching are shown in Figure 6-2. The object model was derived from a set of 16 images, of which the image on the left is an example. In this set, only the light source position varied. The image features used in the search were derived from the image on the right.

The features used for matching were derived from the edge maps shown in Figure 6-3. The image on the left shows the object model edges and the image on the right shows the image edges. These edges are from the Canny edge detector [13]. The smoothing standard deviation is eight pixels - these are low resolution edge maps. The object model edges were derived from a set of 16 edge maps, corresponding to the 16 images described above. The object model edges are essentially the mean edges with respect to fluctuations induced by variations in lighting. (Low resolution edges are sensitive to lighting.) They are similar to the Mean Edge Images described in Section 4.4.

Figure 6-3: Edge Maps used for Matching

The features used in matching are shown in Figure 6-4. These are point-radius features, as described in Section 5.3. The point coordinates of the features are indicated by dots, while the normal vector and curvature are illustrated by arc fragments. Each feature represents 30 edge pixels. The 40 object features appear in the upper picture, the 125 image features in the lower picture. The distinguished features used in the bootstrap of the search are indicated with circles. The object features have been transformed to a new pose to ensure generality.

The parameters that appear in the objective function are: $B$, the background probability, and $\hat{\psi}$, the stationary feature covariance. These were derived from a match done by hand in the example domain. The oriented stationary statistics model of Section 3.3 was used here. (A normal model of feature fluctuations is implicit in the objective function of Equation 6.8. This was found to be a good model in this domain.)

Figure 6-5: Pose Prior used in Search

A loose pose prior was used. This pose prior is illustrated in Figure 6-5. The prior places the object in the upper left corner of the image. The one standard deviation intervals of position and angle are illustrated. The one standard deviation variation of scale is 30 percent. The actual pose of the object is within the indicated one standard deviation bounds. This prior was chosen to demonstrate that the method works well despite a loose pose prior.

The best results of the beam search appear in Figure 6-6. In the upper image, the object features are delineated with heavy lines. They are located according to the pose associated with the best match. In the lower image, the object features and image features are illustrated, while the 18 correspondences associated with the best match appear as heavy lines and dots.

The object features located according to the poses associated with the five best matches are seen in Figure 6-7. The results are difficult to distinguish because the poses are very similar.



Figure 6-7: Best Five Match Results

6.3 Search in Pose Space

This section will explore searching the MMM objective function in pose space. Connections to robust chamfer matching will be described.

A pose estimate is sought by ordering the search for maxima of the MMM objective function as follows,

$$ \hat{\beta} = \arg\max_\beta \max_\Gamma \; L(\Gamma, \beta) . $$

Substituting the objective function from Equation 6.6 yields

$$ \hat{\beta} = \arg\max_\beta \max_\Gamma \sum_{ij : \Gamma_i = M_j} \left[ \lambda_{ij} - \frac{1}{2} (Y_i - \mathcal{P}(M_j, \beta))^T \psi_{ij}^{-1} (Y_i - \mathcal{P}(M_j, \beta)) \right] . $$

The pose prior term has been dropped in the interest of clarity. It would be easily retained as an additional quadratic term.

This equation may be simplified with the following definition,

$$ D_{ij}(x) \equiv \frac{1}{2} x^T \psi_{ij}^{-1} x . $$

$D_{ij}(x)$ may be thought of as a generalized squared distance between observed and predicted features. It has been called the squared Mahalanobis distance [22].

The pose estimator may now be written as

$$ \hat{\beta} = \arg\max_\beta \max_\Gamma \sum_{ij : \Gamma_i = M_j} \left[ \lambda_{ij} - D_{ij}(Y_i - \mathcal{P}(M_j, \beta)) \right] , $$

or equivalently, as a minimization rather than maximization,

$$ \hat{\beta} = \arg\min_\beta \min_\Gamma \sum_{ij : \Gamma_i = M_j} \left[ D_{ij}(Y_i - \mathcal{P}(M_j, \beta)) - \lambda_{ij} \right] . $$

The sum is taken over those image features that are assigned to model features (not the background) in the match. It may be re-written in the following way,

$$ \hat{\beta} = \arg\min_\beta \sum_i \min_{\Gamma_i} \begin{cases} 0 & \text{if } \Gamma_i = \perp \\ D_{ij}(Y_i - \mathcal{P}(M_j, \beta)) - \lambda_{ij} & \text{if } \Gamma_i = M_j \end{cases} $$

or as

$$ \hat{\beta} = \arg\min_\beta \sum_i \min\left(0, \; \min_j \left[ D_{ij}(Y_i - \mathcal{P}(M_j, \beta)) - \lambda_{ij} \right] \right) . $$

If the correspondence reward is independent of the model feature (this holds when oriented stationary statistics are used), $\lambda_{ij} = \lambda_i$. In this case, $\lambda_i$ may be added to each term in the sum without affecting the minimizing pose, yielding the following form for the pose estimator,

$$ \hat{\beta} = \arg\min_\beta \sum_i \min\left(\lambda_i, \; \min_j D_{ij}(Y_i - \mathcal{P}(M_j, \beta)) \right) . \tag{6.11} $$

This objective function is easily interpreted - it is the sum, taken over image features, of a saturated penalty. The penalty (before saturation) is the smallest generalized squared distance from the observed image feature to some projected model feature. The penalty $\min_j D_{ij}(x - \mathcal{P}(M_j, \beta))$ has the form of a Voronoi surface, as described by Huttenlocher et al. [42]. They describe a measure of similarity on image patterns, the Hausdorff distance, that is the upper envelope (maximum) of Voronoi surfaces. The measure used here differs in being saturated, and by using the sum of Voronoi surfaces, rather than the upper envelope. In their work, the upper envelope offers some reduction in the complexity of the measure, and facilitates the use of methods of computational geometry for explicitly computing the measure in 2 and 3 dimensional spaces.
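The saturated penalty can be evaluated directly for a given pose. The following is a minimal sketch under simplifying assumptions: a single stationary inverse covariance shared by all feature pairs, a constant reward, and a pure 2D translation as the projection. The function name and all numeric values are hypothetical.

```python
import numpy as np

def saturated_pose_objective(image_feats, projected_model, psi_inv, lam):
    """Sum over image features of min(lam, min_j D(Y_i - projected M_j)),
    where D(x) = 0.5 * x^T psi_inv x is the generalized squared distance."""
    diff = image_feats[:, None, :] - projected_model[None, :, :]    # (n, m, v)
    d = 0.5 * np.einsum('nmv,vw,nmw->nm', diff, psi_inv, diff)      # (n, m)
    return np.minimum(lam, d.min(axis=1)).sum()

model = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
true_shift = np.array([2.0, 1.0])
clutter = np.array([[40.0, 40.0]])                  # one background feature
image_feats = np.vstack([model + true_shift, clutter])
psi_inv = np.eye(2)                                 # isotropic, unit variance
lam = 3.0

obj_true = saturated_pose_objective(image_feats, model + true_shift, psi_inv, lam)
obj_wrong = saturated_pose_objective(image_feats, model, psi_inv, lam)
print(obj_true, obj_wrong)
```

At the correct translation only the clutter feature contributes, and its contribution is capped at the reward value, so a distant outlier cannot dominate the score.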

Computational geometry methods might be useful for computing the objective function of Equation 6.11. In higher dimensional pose spaces (4 or 6, for example) KD-tree methods may be the only such techniques currently available. Breuel has used KD-tree search algorithms in feature matching.

Next a connection will be shown between MMM search in pose space and a method of robust chamfer matching. First, the domain of MMM is simplified in the following way. Full stationarity of feature fluctuations is assumed (as covered in Section 3.3). Further, the feature covariance is assumed to be isotropic. With these assumptions we have $\psi_{ij} = \sigma^2 I$, and $D_{ij}(x) = \frac{1}{2\sigma^2} |x|^2$. Additionally, assuming constant background probability, we have $\lambda_{ij} = \lambda$. The pose estimator of Equation 6.11 may now be written in the following simplified form,

$$ \hat{\beta} = \arg\min_\beta \sum_i \min\left(\lambda, \; \min_j \frac{1}{2\sigma^2} |Y_i - \mathcal{P}(M_j, \beta)|^2 \right) . $$



When the projection function is linear, invertible, and distance preserving (2D and 3D rigid transformations satisfy these properties), the estimator may be expressed as follows,

$$ \hat{\beta} = \arg\min_\beta \sum_i \min\left(\lambda, \; \min_j \frac{1}{2\sigma^2} |\mathcal{P}^{-1}(Y_i, \beta) - M_j|^2 \right) . $$

This may be further simplified to

$$ \hat{\beta} = \arg\min_\beta \sum_i \min\left(\lambda, \; d^2(\mathcal{P}^{-1}(Y_i, \beta)) \right) , \tag{6.12} $$

by using the following definition of a minimum distance function,

$$ d(x) \equiv \frac{1}{\sqrt{2}\,\sigma} \min_j |x - M_j| . \tag{6.13} $$



Chamfering methods may be used to tabulate approximations of d(x) in an image-like array that is indexed by pixel coordinates. Chamfer-based approaches to image registration problems use the array to facilitate fast evaluation of pose objective functions. Barrow et al. [3] describe an early method where the objective function is the sum over model features of the distance from the projected model feature to the nearest image feature. Borgefors [8] recommends the use of RMS distance rather than summed distance in the objective function.
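The use of a tabulated distance array can be sketched as follows. This is an illustration under stated assumptions, not a chamfer algorithm proper: the table is filled with exact distances by brute force instead of the two-pass chamfer approximation, the pose is a pure integer translation in (row, column) pixel coordinates, and features that fall outside the array receive the saturated penalty. The function names and numeric values are hypothetical.

```python
import numpy as np

def distance_table(model_pts, shape, sigma):
    """Tabulate d(x) = min_j |x - M_j| / (sqrt(2) * sigma) at every pixel."""
    rows, cols = np.indices(shape)
    grid = np.stack([rows, cols], axis=-1).astype(float)            # (H, W, 2)
    diff = grid[:, :, None, :] - model_pts[None, None, :, :]         # (H, W, m, 2)
    return np.sqrt((diff ** 2).sum(-1)).min(-1) / (np.sqrt(2.0) * sigma)

def robust_chamfer_score(table, image_pts, shift, lam):
    """Sum_i min(lam, d^2(Y_i - shift)), evaluated by table lookup."""
    idx = np.rint(image_pts - shift).astype(int)
    inside = ((idx >= 0) & (idx < np.array(table.shape))).all(axis=1)
    d2 = np.full(len(image_pts), np.inf)          # off-array features saturate
    d2[inside] = table[idx[inside, 0], idx[inside, 1]] ** 2
    return np.minimum(lam, d2).sum()

model_pts = np.array([[5.0, 5.0], [5.0, 12.0], [12.0, 8.0]])
table = distance_table(model_pts, shape=(20, 20), sigma=1.0)

# Image features: the model shifted by (3, 2), plus one clutter feature.
image_pts = np.vstack([model_pts + np.array([3.0, 2.0]), [[19.0, 0.0]]])

# Exhaustive search over a small window of candidate translations.
shifts = [(r, c) for r in range(6) for c in range(6)]
best_shift = min(shifts, key=lambda s: robust_chamfer_score(
    table, image_pts, np.array(s, dtype=float), lam=3.0))
best_score = robust_chamfer_score(table, image_pts, np.array(best_shift, dtype=float), lam=3.0)
print(best_shift, best_score)
```

Once the table is built, each pose evaluation costs one lookup per image feature, independent of the number of model features.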

Recently, Jiang et al. [47] described a method of robust chamfer matching. In order to make the method less susceptible to disturbance by outliers and occlusions, they added saturation to the RMS objective function of Borgefors. Their objective function has the following form:

$$ \frac{1}{3} \left( \frac{1}{n} \sum_j \min(t^2, d_j^2) \right)^{\frac{1}{2}} $$

where $d_j^2$ is the squared distance from the j'th projected model point to the nearest image point. Aside from the constants and square root, which don't affect the minimizing pose, this objective function is equivalent to Equation 6.12 if the role of image and model features is reversed, and the sense of the projection function is inverted. Jiang et al. show impressive results using robust chamfer matching to register multi-modal 3D medical imagery.

6.4 Extensions

MAP Model Matching performs well on low resolution imagery in which feature uncertainty is significant. It could be used to bootstrap a coarse-fine approach to model matching, yielding good results with reasonable running times. Coarse-fine approaches have proven successful in stereo matching applications. (See Grimson [33] and Barnard [2].) A coarse-fine strategy is straightforward in the framework described here. In a hierarchy, the pose estimate from solving the objective function at one scale is used as a prior for the estimation at the next. Having a good prior on the pose will greatly reduce the amount of searching required at high resolution.

Finding a tractable model that incorporates pose dependent visibility conditions would be useful for applying MMM in non view-based recognition.

6.5 Related Work

The HYPER vision system of Ayache and Faugeras [1] uses sequential linear-least-squares pose estimation as well as the linear 2D point feature and projection model described in Section 5.2. HYPER is described as a search algorithm. Different criteria are used to evaluate candidate matches and to evaluate competing "whole" hypotheses. An ad hoc threshold is used for testing a continuous measure of the metrical consistency of candidate match extensions. Whole match hypotheses are evaluated according to the amount of image feature accounted for - although not according to overall metrical consistency. HYPER works well on real images of industrial parts.

Goad outlined a Bayesian strategy of match evaluation based on feature and background statistics in his paper on automatic programming for model-based vision [29]. In his system, search was controlled by thresholds on probabilistic measures of the reliability and plausibility of matches.

Lowe describes in general terms the application of Bayesian techniques in his book on visual recognition [51]. He treats the minimization of expected running time of recognition. In addition he discusses selection among numerous objects.

Object recognition matching systems often use a strategy that can be summarized as a search for the maximal matching that is consistent. Consistency is frequently defined to mean that the matching image feature is within finite bounds of its expected position (bounded error models). Cass' system [14] is one example. Such an approach may be cast in the framework defined here by assuming uniform probability density functions for the feature deviations. Pose solution with this approach is likely to be more complicated than the sequential linear-least-squares method that can be used when feature deviations have normal models. Cass' approach effectively finds the global optimum of its objective function. It performs well on occluded or fragmented real images.

Beveridge, Weiss and Riseman [6] use an objective function for line segment based recognition that is similar to the one described here. In their work, the penalty for deviations is quadratic, while the reward for correspondence is non-linear (exponential) in the amount of missing segment length. (By contrast, the reward described in this paper is, for stationary models, linear in the length of aggregate features.) The trade-off parameters in their objective function were determined empirically. Their system gives good performance in a domain of real images.

Burns and Riseman [12] and Burns [11] describe a classification based recognition system. They focus on the use of description networks for efficiently searching among multiple objects with a recursive indexing scheme.

Hanson and Fua [27] [26] describe a general objective function approach to image understanding. They use a minimum description length (MDL) criterion that is designed to work with generic object models. The approach presented here is tailored for specific object models.

6.6 Summary

A MAP model matching technique for visual object recognition has been described. The resulting objective function has a simple form when normal feature deviation models and linear projection models are used. Experimental results were shown indicating that MAP Model Matching works well in low resolution matching, where feature deviations are significant. Related work was discussed.



Chapter 7

Posterior Marginal Pose Estimation


In the previous chapter on MAP Model Matching the object recognition problem was posed as an optimization problem resulting from a statistical theory. In that formulation, complete hypotheses consist of a description of the correspondences between image and object features, as well as the pose of the object. The method was shown to provide effective evaluations of match and pose.

The formulation of recognition that is described in this chapter, Posterior Marginal Pose Estimation¹ (PMPE), builds on MAP Model Matching. It provides a smooth objective function for evaluating the pose of the object - without commitment to a particular match. The pose is the most important aspect of the problem, in the sense that knowing the pose enables grasping or other interaction with the object.

In this chapter, the objective function is explored by probing in selected parts of pose space. The domain of these experiments is features derived from synthetic laser radar range imagery and grayscale video imagery. A limited pose space search is performed in the video experiment.

In Chapter 8 the Expectation-Maximization (EM) algorithm is discussed as a means of searching for local maxima of the objective function in pose space.

¹ An early version of this work appeared in [76].

Additional experiments in object recognition using the PMPE objective function are described in Chapter 10. There, the EM algorithm is used in conjunction with an indexing method that generates initial hypotheses.



7.1 Objective Function for Pose

The following method was motivated by the observation that in heuristic searches over
correspondences with the objective function of MAP Model Matching, hypotheses
having implausible matches scored poorly in the objective function. The implication
was that summing posterior probability over all the matches (at a specific pose) might
provide a good pose evaluator. This has proven to be the case. Although intuitively
this might seem like an odd way to evaluate a pose, it is at least democratic in that
all poses are evaluated in the same way. The resulting pose estimator is smooth,
and is amenable to local search in pose space. It is not tied to specific matches -
it is perhaps in keeping with Marr's recommendation that computational theories of
vision should try to satisfy a principle of least commitment [52].

Additional motivation was provided by the work by Yuille, Geiger and Bülthoff
on stereo [78]. They discussed computing disparities in a statistical theory of stereo
where a marginal is computed over matches.

In MAP Model Matching, joint hypotheses of match and pose were evaluated by
their posterior probability given an image - $p(\Gamma, \beta \mid Y)$. $\Gamma$ and $\beta$ stand for
correspondences and pose, respectively, and $Y$ for the image features. The posterior
probability was built from specific models of features and correspondences, objects,
and projection that were described in the previous chapters. The present formulation
will first be described using the independent correspondence model. Use of the
Markov correspondence model will be described in the following section.

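Here we use the same strategy for evaluating object poses: they are evaluated
by their posterior probability, given an image: $p(\beta \mid Y)$. The posterior probability
density of the pose may be computed from the joint posterior probability on pose and
match, by formally taking the marginal over possible matches:

$$p(\beta \mid Y) = \sum_{\Gamma} p(\Gamma, \beta \mid Y) .$$

In Section 6.1, Equation 6.4, $p(\Gamma, \beta \mid Y)$ was obtained via Bayes' rule from
probabilistic models of image features, correspondences, and the pose. Substituting for
$p(\Gamma, \beta \mid Y)$, the posterior marginal may be written as

$$p(\beta \mid Y) = \sum_{\Gamma} \frac{p(Y \mid \Gamma, \beta)\, p(\Gamma, \beta)}{p(Y)} . \tag{7.1}$$

Using Equations 2.1 (the independent feature model) and 6.2, we may express the
posterior marginal of $\beta$ in terms of the component densities:

$$p(\beta \mid Y) = \frac{1}{p(Y)} \sum_{\Gamma_1} \sum_{\Gamma_2} \cdots \sum_{\Gamma_n} \prod_{i} p(Y_i \mid \Gamma_i, \beta) \prod_{i} p(\Gamma_i)\; p(\beta)$$

or

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_1} \sum_{\Gamma_2} \cdots \sum_{\Gamma_n} \prod_{i=1}^{n} \left[ p(Y_i \mid \Gamma_i, \beta)\, p(\Gamma_i) \right] .$$

Breaking one factor out of the product gives

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_1} \sum_{\Gamma_2} \cdots \sum_{\Gamma_n} \left[ \prod_{i=1}^{n-1} p(Y_i \mid \Gamma_i, \beta)\, p(\Gamma_i) \right] p(Y_n \mid \Gamma_n, \beta)\, p(\Gamma_n)$$

or

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_1} \sum_{\Gamma_2} \cdots \sum_{\Gamma_{n-1}} \left[ \prod_{i=1}^{n-1} p(Y_i \mid \Gamma_i, \beta)\, p(\Gamma_i) \right] \sum_{\Gamma_n} p(Y_n \mid \Gamma_n, \beta)\, p(\Gamma_n) .$$

Continuing in similar fashion yields

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \prod_{i} \left[ \sum_{\Gamma_i} p(Y_i \mid \Gamma_i, \beta)\, p(\Gamma_i) \right] .$$

This may be written as

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \prod_{i} p(Y_i \mid \beta) , \tag{7.2}$$

since

$$p(Y_i \mid \beta) = \sum_{\Gamma_i} p(Y_i \mid \Gamma_i, \beta)\, p(\Gamma_i) . \tag{7.3}$$

Splitting the $\Gamma_i$ sum into its cases gives,

$$p(Y_i \mid \beta) = p(Y_i \mid \Gamma_i = \perp, \beta)\, p(\Gamma_i = \perp) + \sum_{M_j} p(Y_i \mid \Gamma_i = M_j, \beta)\, p(\Gamma_i = M_j) .$$

Substituting the densities assumed in the model of Section 6.1 in Equations 6.1 and
2.2 then yields

$$p(Y_i \mid \beta) = \frac{B_i}{W_1 W_2 \cdots W_v} + \sum_{M_j} G_{\psi_{ij}}\!\left(Y_i - P(M_j, \beta)\right) \frac{1 - B_i}{m} . \tag{7.4}$$

Installing this into Equation 7.2 leads to

$$p(\beta \mid Y) = \frac{B_1 B_2 \cdots B_n\; p(\beta)}{(W_1 W_2 \cdots W_v)^n\; p(Y)} \prod_{i} \left[ 1 + \sum_{M_j} \frac{W_1 \cdots W_v}{m}\, \frac{1 - B_i}{B_i}\, G_{\psi_{ij}}\!\left(Y_i - P(M_j, \beta)\right) \right] .$$

As in Section 6.1 the objective function for Posterior Marginal Pose Estimation is
defined as the scaled logarithm of the posterior marginal probability of the pose,

$$L(\beta) \equiv \ln \left[ \frac{p(\beta \mid Y)}{C} \right] ,$$

where, as before,

$$C = \frac{B_1 B_2 \cdots B_n}{(W_1 W_2 \cdots W_v)^n}\; \frac{1}{(2\pi)^{z/2}\, |\psi_\beta|^{1/2}}\; \frac{1}{p(Y)} .$$

This leads to the following expression for the objective function (use of a normal pose
prior is assumed)

$$L(\beta) = -\frac{1}{2} (\beta - \beta_0)^T \psi_\beta^{-1} (\beta - \beta_0) + \sum_{i} \ln \left[ 1 + \sum_{j} \frac{W_1 \cdots W_v}{m}\, \frac{1 - B_i}{B_i}\, G_{\psi_{ij}}\!\left(Y_i - P(M_j, \beta)\right) \right] . \tag{7.5}$$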



This objective function for evaluating pose hypotheses is a smooth function of the
pose. Methods of continuous optimization may be used to search for local maxima,
although starting values are an issue.

The first term in the PMPE objective function (Equation 7.5) is due to the pose
prior. It is a quadratic penalty for deviations from the nominal pose. The second
term essentially measures the degree of alignment of the object model with the image.
It is a sum taken over image features of a smooth nonlinear function that peaks up
positively when the pose brings object features into alignment with the image feature
in question. The logarithmic term will be near zero if there are no model features
close to the image feature in question.

In a straightforward implementation of the objective function, the cost of
evaluating a pose is $O(mn)$, since it is essentially a nonlinear double sum over image and
model features.
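The double sum in Equation 7.5 can be made concrete with a short sketch. The Python below is only an illustration under simplifying assumptions, not the implementation used in the experiments: features are 2-D points, the pose is a pure translation $(t_x, t_y)$, a single isotropic variance `sigma**2` stands in for $\psi_{ij}$, and the background rate `B` is constant. The function names, the toy features, and every parameter value are hypothetical.

```python
import math

def gaussian2d(dx, dy, sigma):
    """Isotropic 2-D normal density with standard deviation sigma (assumed form)."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)

def pmpe_objective(pose, image_feats, model_feats, sigma, B, W1, W2,
                   pose0=(0.0, 0.0), sigma_pose=1e3):
    """Equation 7.5 for a translation-only pose (tx, ty)."""
    tx, ty = pose
    m = len(model_feats)
    # First term: quadratic penalty for deviation from the nominal pose.
    total = -0.5 * ((tx - pose0[0]) ** 2 + (ty - pose0[1]) ** 2) / sigma_pose ** 2
    # Constant factor (W1 W2 / m) (1 - B) / B inside the logarithm.
    scale = (W1 * W2 / m) * (1.0 - B) / B
    for (yx, yy) in image_feats:              # sum over image features
        s = 0.0
        for (mx, my) in model_feats:          # sum over model features
            s += gaussian2d(yx - (mx + tx), yy - (my + ty), sigma)
        total += math.log(1.0 + scale * s)
    return total

# Toy data: four model points, shifted by the true pose, plus two clutter points.
model = [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0), (0.0, 5.0)]
true_pose = (20.0, 30.0)
image = [(mx + true_pose[0], my + true_pose[1]) for mx, my in model]
image += [(100.0, 7.0), (3.0, 90.0)]

good = pmpe_objective(true_pose, image, model, sigma=1.0, B=0.5, W1=128.0, W2=128.0)
bad = pmpe_objective((60.0, 60.0), image, model, sigma=1.0, B=0.5, W1=128.0, W2=128.0)
print(good, bad)
```

Each evaluation visits every image-model feature pair once, which is the $O(mn)$ cost noted above. At the misaligned pose the logarithmic terms are all close to zero, as the text describes.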

7.2 Using the Markov Correspondence Model

When the Markov Correspondence model of Section 2.3 is used instead of the
independent correspondence model, the summing techniques of the previous section
no longer apply. Because of this, a computationally attractive closed form formula
for the posterior probability no longer obtains. Nevertheless, it will be shown that
the posterior probability at a pose can still be efficiently evaluated using dynamic
programming.

Referring to Equation 7.1, and using the independence of match and pose in the
prior (discussed in Section 6.1), the posterior marginal probability of a pose may be
written as follows,

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma} p(Y \mid \Gamma, \beta)\, p(\Gamma) .$$

Using Equations 2.3 and 6.1,

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_1 \Gamma_2 \ldots \Gamma_n} \left[ \prod_{i=1}^{n} p(Y_i \mid \Gamma_i, \beta)\, q(\Gamma_i) \right] r_1(\Gamma_1, \Gamma_2)\, r_2(\Gamma_2, \Gamma_3) \cdots r_{n-1}(\Gamma_{n-1}, \Gamma_n) .$$

This may be re-written as follows,

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_1 \Gamma_2 \ldots \Gamma_n} \left[ \prod_{i=1}^{n} c_i(\Gamma_i) \prod_{i=1}^{n-1} r_i(\Gamma_i, \Gamma_{i+1}) \right] , \tag{7.6}$$

where

$$c_i(\Gamma_i) \equiv p(Y_i \mid \Gamma_i, \beta)\, q(\Gamma_i) .$$

Here, the dependence of $c$ on $\beta$ has been suppressed for notational brevity.

Next it will be shown that $p(\beta \mid Y)$ may be written using a recurrence relation

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_n} h_{n-1}(\Gamma_n)\, c_n(\Gamma_n) , \tag{7.7}$$

where

$$h_1(a) \equiv \sum_{b} c_1(b)\, r_1(b, a) \tag{7.8}$$

and

$$h_{n+1}(a) \equiv \sum_{b} h_n(b)\, c_{n+1}(b)\, r_{n+1}(b, a) . \tag{7.9}$$

Expanding Equation 7.7 in terms of the recurrence relation,



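$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_n} \left[ \sum_{\Gamma_{n-1}} h_{n-2}(\Gamma_{n-1})\, c_{n-1}(\Gamma_{n-1})\, r_{n-1}(\Gamma_{n-1}, \Gamma_n) \right] c_n(\Gamma_n) ,$$

or

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_{n-1} \Gamma_n} h_{n-2}(\Gamma_{n-1}) \prod_{i=n-1}^{n} c_i(\Gamma_i)\; r_{n-1}(\Gamma_{n-1}, \Gamma_n) .$$

Again using the recurrence relation,

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_{n-1} \Gamma_n} \left[ \sum_{\Gamma_{n-2}} h_{n-3}(\Gamma_{n-2})\, c_{n-2}(\Gamma_{n-2})\, r_{n-2}(\Gamma_{n-2}, \Gamma_{n-1}) \right] \prod_{i=n-1}^{n} c_i(\Gamma_i)\; r_{n-1}(\Gamma_{n-1}, \Gamma_n) ,$$

or

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_{n-2} \Gamma_{n-1} \Gamma_n} h_{n-3}(\Gamma_{n-2}) \prod_{i=n-2}^{n} c_i(\Gamma_i) \prod_{i=n-2}^{n-1} r_i(\Gamma_i, \Gamma_{i+1}) .$$

Continuing in similar fashion leads to

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_2 \Gamma_3 \ldots \Gamma_n} h_1(\Gamma_2) \prod_{i=2}^{n} c_i(\Gamma_i) \prod_{i=2}^{n-1} r_i(\Gamma_i, \Gamma_{i+1}) ,$$

and now using the base expression for $h_1$,

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_2 \Gamma_3 \ldots \Gamma_n} \left[ \sum_{\Gamma_1} c_1(\Gamma_1)\, r_1(\Gamma_1, \Gamma_2) \right] \prod_{i=2}^{n} c_i(\Gamma_i) \prod_{i=2}^{n-1} r_i(\Gamma_i, \Gamma_{i+1}) ,$$

or finally,

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \sum_{\Gamma_1 \Gamma_2 \ldots \Gamma_n} \left[ \prod_{i=1}^{n} c_i(\Gamma_i) \prod_{i=1}^{n-1} r_i(\Gamma_i, \Gamma_{i+1}) \right] ,$$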

which is the same as Equation 7.6. This completes the verification of Equation 7.7.

Next, a dynamic programming algorithm will be described that efficiently
evaluates an objective function that is proportional to the posterior marginal probability
of a pose. The objective function is $\frac{p(Y)}{p(\beta)}\, p(\beta \mid Y)$. The algorithm is a direct
implementation of the recurrence defined in Equations 7.7, 7.8 and 7.9, that builds a table
of values of $h_i(\cdot)$ from the bottom up. Note that $h_i(b)$ only has two values, depending
on whether $b = \perp$ or not. In the following description, the symbol $\Gamma$ is used to stand
for an anonymous model feature. $H_{\cdot\cdot}$ denotes array locations that store values of $h_i$,
and $H(\cdot, \cdot)$ is an access function, defined below, that accesses the stored values.

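;;; Use Dynamic Programming to evaluate PMPE with Markov Correspondence Model
Evaluate-Pose($\beta$)
    $H_{1a} \leftarrow \sum_b C(1, b, \beta)\, r_1(b, a)$    for $a \in \{\perp, \Gamma\}$
    For $i \leftarrow 2$ To $N-1$
        $H_{ia} \leftarrow \sum_b H(i-1, b)\, C(i, b, \beta)\, r_i(b, a)$    for $a \in \{\perp, \Gamma\}$
    Return ($\sum_b H(N-1, b)\, C(N, b, \beta)$)

;;; Define the auxiliary function C
C($i$, $b$, $\beta$)
    Return ($p(Y_i \mid b, \beta)\, q(b)$)

;;; Access values of H stored in a table.
H($a$, $b$)
    If $b = \perp$ Return ($H_{a\perp}$)
    Else Return ($H_{a\Gamma}$)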



The loop in EVALUATE-POSE executes $O(n)$ times, and each time through the
loop does $O(m)$ evaluations of the summands, so the complexity is $O(mn)$. This
has the same complexity as a straightforward implementation of the PMPE objective
function when the Markov model is not used (Equation 7.5).

The summing technique used here was described by Cheeseman [17] in a paper
about using maximum entropy methods in expert systems.
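The recurrence of Equations 7.7 to 7.9 can be checked against the exhaustive sum of Equation 7.6 on a small example. The sketch below is an illustration only. It keeps one table entry per correspondence label rather than the two entries ($\perp$ and an anonymous model feature) used by EVALUATE-POSE, so it costs $O(n L^2)$ for $L$ labels rather than $O(mn)$. The tables `c` and `r` are assumed to hold precomputed values of $c_i(\cdot)$ and $r_i(\cdot, \cdot)$ at a fixed pose; the names and the random test data are hypothetical.

```python
import itertools
import random

def evaluate_pose_dp(c, r):
    """Sum over all correspondences via Equations 7.7-7.9.

    c[i][b]    holds c_{i+1}(b)
    r[i][b][a] holds r_{i+1}(b, a), for i = 0 .. n-2
    Returns (p(Y) / p(beta)) * p(beta | Y).
    """
    n = len(c)
    labels = range(len(c[0]))
    if n == 1:
        return sum(c[0])
    # Base case (7.8): h_1(a) = sum_b c_1(b) r_1(b, a)
    h = [sum(c[0][b] * r[0][b][a] for b in labels) for a in labels]
    # Recurrence (7.9): h_{k+1}(a) = sum_b h_k(b) c_{k+1}(b) r_{k+1}(b, a)
    for i in range(1, n - 1):
        h = [sum(h[b] * c[i][b] * r[i][b][a] for b in labels) for a in labels]
    # Final sum (7.7) over the last correspondence.
    return sum(h[b] * c[n - 1][b] for b in labels)

def evaluate_pose_brute(c, r):
    """Direct evaluation of the sum in Equation 7.6 (exponential cost)."""
    n = len(c)
    labels = range(len(c[0]))
    total = 0.0
    for gamma in itertools.product(labels, repeat=n):
        term = 1.0
        for i in range(n):
            term *= c[i][gamma[i]]
        for i in range(n - 1):
            term *= r[i][gamma[i]][gamma[i + 1]]
        total += term
    return total

# Small random instance: n = 4 image features, 3 labels (background + 2 model features).
random.seed(0)
n_feats, n_labels = 4, 3
c_demo = [[random.uniform(0.1, 1.0) for _ in range(n_labels)] for _ in range(n_feats)]
r_demo = [[[random.uniform(0.5, 1.5) for _ in range(n_labels)] for _ in range(n_labels)]
          for _ in range(n_feats - 1)]
print(evaluate_pose_dp(c_demo, r_demo), evaluate_pose_brute(c_demo, r_demo))
```

The two results should agree to rounding error, while the dynamic programming version avoids enumerating all $L^n$ correspondence assignments.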

7.3 Range Image Experiment

An experiment investigating the utility of Posterior Marginal Pose Estimation is
described in this section. Additional experiments are described in Chapter 10.

The objective function of Equation 7.5 was sampled in a domain of synthetic range
imagery. The feasibility of coarse-fine search methods was investigated by sampling
smoothed variants of the objective function.

7.3.1 Preparation of Features

The preparation of the features used in the experiment is summarized in Figure 7-1.
The features were oriented range features, as described in Section 5.4. Two sets of
features were prepared, the "model features", and the "image features".

The object model features were derived from a synthetic range image of an M35
truck that was created using the ray tracing program associated with the BRL CAD
Package [23]. The ray tracer was modified to produce range images instead of shaded
images. The synthetic range image appears in the upper left of Figure 7-2.

In order to simulate a laser radar, the synthetic range image described above was
corrupted with simulated laser radar sensor noise, using a sensor noise model that
is described by Shapiro, Reinhold, and Park [62]. In this noise model, measured
ranges are either valid or anomalous. Valid measurements are normally distributed,
and anomalous measurements are uniformly distributed. The corrupted range image
appears in Figure 7-2 on the right. To simulate post sensor processing, the corrupted
image was "restored" via a statistical restoration method of Menon and Wells [56].
The restored range image appears in the lower position of Figure 7-2.

Figure 7-1: Preparation of Features

Figure 7-2: Synthetic Range Image, Noisy Range Image, and Restored Range Image

Figure 7-3: Model Features, Noisy Features, and Image Features

Oriented range features, as described in Section 5.4, were extracted from the
synthetic range image, for use as model features - and from the restored range image,
these are called the noisy features. The features were extracted from the range images
in the following manner. Range discontinuities were located by thresholding
neighboring pixels, yielding range discontinuity curves. These curves were then segmented
into approximately 20-pixel-long segments via a process of line segment approximation.
The line segments (each representing a fragment of a range discontinuity curve)
were then converted into oriented range features in the following manner. The X and
Y coordinates of the feature were obtained from the mean of the pixel coordinates.
The normal vector to the pixels was gotten via least-squares line fitting. The range
to the feature was estimated by taking the mean of the pixel ranges on the near side
of the discontinuity. This information was packaged into an oriented range feature,
as described in Section 5.4. The model features are shown in the first image of
Figure 7-3. Each line segment represents one oriented range feature, the ticks on the
segments indicate the near side of the range discontinuity. There are 113 such object
features.
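The conversion of one line segment into an oriented range feature might be sketched as follows. This is a hedged illustration rather than the procedure actually used: it assumes the segment's pixels and the ranges of its near-side pixels are already available, it uses an orthogonal (total) least-squares line fit as the "least-squares line fitting" step, and it leaves the sign of the normal unresolved. The function name and the dictionary layout are hypothetical.

```python
import math

def oriented_range_feature(pixels, near_ranges):
    """Build one feature from the (x, y) pixels of a segment.

    pixels      : list of (x, y) pixel coordinates along the segment
    near_ranges : ranges of the pixels on the near side of the discontinuity
    """
    n = len(pixels)
    # Feature position: mean of the pixel coordinates.
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    # Second moments about the mean, used for the line fit.
    sxx = sum((x - cx) ** 2 for x, _ in pixels)
    syy = sum((y - cy) ** 2 for _, y in pixels)
    sxy = sum((x - cx) * (y - cy) for x, y in pixels)
    # Direction of the best-fit line in the orthogonal least-squares sense.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Unit normal to that line; its sign (toward the near side) is not resolved here.
    normal = (-math.sin(theta), math.cos(theta))
    # Feature range: mean of the near-side pixel ranges.
    rng = sum(near_ranges) / len(near_ranges)
    return {"x": cx, "y": cy, "normal": normal, "range": rng}

# A horizontal 20-pixel segment with three near-side range readings.
segment = [(float(x), 5.0) for x in range(20)]
feature = oriented_range_feature(segment, [10.0, 10.2, 9.8])
print(feature)
```

For the horizontal segment in the example the fitted normal is vertical, as expected.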

The noisy features, derived from the restored range image, appear in the second
image of Figure 7-3. There are 62 noisy features. Some features have been lost due
to the corruption and restoration of the range image. The set of image features was
prepared from the noisy features by randomly deleting half of the features,
transforming the survivors according to a test pose, and adding sufficient randomly generated
features so that 1/8 of the features are due to the object. The 248 image features appear
in the third image of Figure 7-3.
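The feature counts quoted above fit together as follows. This is a check of the arithmetic only, assuming that exactly half of the 62 noisy features survive the random deletion:

```latex
\begin{align*}
  \text{object features kept}        &= 62 / 2 = 31, \\
  \text{fraction due to the object}  &= 31 / 248 = 1/8, \\
  \text{random background features}  &= 248 - 31 = 217, \\
  \text{background fraction}         &= 217 / 248 = 7/8 .
\end{align*}
```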

7.3.2 Sampling The Objective Function

The objective function of Equation 7.5 was sampled along four straight lines passing
through the (known) location in pose space of the test pose. Oriented stationary
statistics were used, as described in Section 3.3. The stationary feature covariance
was estimated from a hand match done with a mouse between the model features and
the noisy features. The background rate parameter $B$ was set to 7/8.

Samples taken along a line through the location of the true pose in pose space,
parallel to the $x$ axis are shown in Figure 7-4. This corresponds to moving the object
along the $x$ axis. The first graph shows samples taken along a 100 pixel length (the
image is 256 pixels square). The second graph of Figure 7-4 shows samples taken
along a 10 pixel length, and the third graph shows samples taken along a 1 pixel
length. The $x$ coordinate of the test pose is 55.5, the third graph shows the peak of
the objective function to be in error by about one twentieth pixel.

Samples taken along a line parallel to the $\mu$ axis of pose space are shown in Figure
7-5. This corresponds to a simultaneous change in scale and angular orientation of
the object.

Each of the above graphs represents 50 equally spaced samples. The samples are
joined with straight line segments for clarity. Sampling was also done parallel to the
$y$ and $\nu$ axes with similar results.

The sampling described in this section shows that in the experimental domain the
objective function has a prominent, sharp peak near the correct location. Some local
maxima are also apparent. The observed peak may not be the dominant peak - no
global searching was performed.

Coarse-Fine Sampling

Additional sampling of the objective of Equation 7.5 was performed to investigate the
feasibility of coarse-fine search techniques. A coarse-fine search method for finding
maxima of the pose-space objective function would proceed as follows. Peaks are
initially located at a coarse scale. At each stage, the peak from the previous scale is
used as the starting value for a search at the next (less smooth) scale.

The objective function was smoothed by replacing the stationary feature covariance
matrix $\psi$ in the following manner:

$$\psi \leftarrow \psi + \psi_s .$$

The effect of the smoothing matrix $\psi_s$ is to increase the spatial scale of the
covariance matrices that appear in the objective function.

Figure 7-4: Objective Function Samples Along X-Oriented Line Through Test Pose,
Lengths: 100 Pixels, 10 Pixels, 1 Pixel

Figure 7-5: Objective Function Samples Along $\mu$-Oriented Line Through Test Pose,
Lengths: .8, .1, and .01

Probes along the $x$ axis through the known location of the test pose, with various
amounts of smoothing are shown in Figure 7-6. The smoothing matrices used in the
probing were as follows, in the same order as the figures.

$$\mathrm{DIAG}\left((.1)^2, (.1)^2, (10.0)^2, (10.0)^2\right) ,$$

$$\mathrm{DIAG}\left((.025)^2, (.025)^2, (2.5)^2, (2.5)^2\right) ,$$

and

$$\mathrm{DIAG}\left((.01)^2, (.01)^2, 1.0, 1.0\right) ,$$

where $\mathrm{DIAG}(\cdot)$ constructs diagonal matrices from its arguments. These smoothing
matrices were determined empirically. (No smoothing was performed in the fourth
figure.)

These smoothed sampling experiments indicate that coarse-fine search may be
feasible in this domain. In Figure 7-6 it is apparent that the peak at one scale may
be used as a starting value for local search in the next scale. This indicates that a
final line search along the $x$ axis could use the coarse-fine strategy. It is not sufficient
evidence that such a strategy will work in general. As before, there is no guarantee
that the located maximum is the global maximum.
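The coarse-fine strategy can be sketched on a one-dimensional toy problem. The code below is an illustration under stated assumptions, not the procedure used in the experiments: the pose is a scalar translation, there is a single model feature and no pose prior, smoothing adds `s**2` to the feature variance (the scalar analog of adding a smoothing matrix to the covariance), and the local search is a simple step-halving hill climb chosen here for brevity. All names, samples, and parameter values are hypothetical.

```python
import math

def objective(beta, samples, sigma, B, W):
    """1-D translation-only analog of the PMPE objective, without the prior term."""
    total = 0.0
    for y in samples:
        g = math.exp(-(y - beta) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)
        total += math.log(1.0 + (W * (1.0 - B) / B) * g)
    return total

def hill_climb(f, start, step, min_step=1e-4):
    """Local maximization: move while a neighbor improves, otherwise halve the step."""
    x, fx = start, f(start)
    while step > min_step:
        moved = False
        for cand in (x - step, x + step):
            fc = f(cand)
            if fc > fx:
                x, fx, moved = cand, fc, True
                break
        if not moved:
            step *= 0.5
    return x

def coarse_fine_search(samples, start, sigma, B, W, smoothing_sigmas):
    """Search at each scale, seeding it with the peak found at the previous scale."""
    x = start
    for s in smoothing_sigmas:               # ordered from coarse to fine
        sig = math.sqrt(sigma ** 2 + s ** 2) # smoothed variance: sigma^2 + s^2
        x = hill_climb(lambda b: objective(b, samples, sig, B, W), x, step=sig)
    return x

# A cluster near 55.5 plus scattered background samples in a 256-wide image.
samples = [55.1, 55.4, 55.5, 55.7, 55.8, 3.0, 20.0, 97.0, 140.0, 201.0, 250.0]
estimate = coarse_fine_search(samples, start=90.0, sigma=0.5, B=0.5, W=256.0,
                              smoothing_sigmas=[20.0, 5.0, 1.0, 0.0])
print(estimate)
```

As in the text, nothing here guarantees that the located maximum is the global one; the example only shows a peak found at a coarse scale being handed down as the starting value for the next scale.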
Figure 7-6: X Probes in Smoothed Objective Function

7.4 Video Image Experiment

In this section, another experiment with the PMPE objective function is described.
The features are point-radius features derived from video images. A local search in
pose space is carried out, and the objective function, and a smoothed variant, are
probed in the vicinity of the peak.

7.4.1 Preparation of Features

The features used in this experiment are the same as those used in the MAP Model
Matching correspondence search experiment reported in Section 6.2. They are
point-radius features, as described in Section 5.3. The features appear in Figure 6-4.

7.4.2 Search in Pose Space

A search was carried out in pose space from a starting value that was determined by
hand. The search was implemented with Powell's method [59] of multidimensional
nonlinear optimization. Powell's method is similar to the conjugate-gradient method,
but derivatives are not used. The line minimizations were carried out with Brent's
method [59], which uses successive parabolic approximations. The pose resulting
from the search is illustrated in Figure 7-7. This result is close to the best result
from the MAP Model Matching correspondence search experiment. That result is
reproduced here in Figure 7-8. It is comforting that these two substantially different
search methods (combinatorial versus continuous) provide similar answers in, at least,
one experiment.

Figure 7-7: Results of Search in Pose Space

Figure 7-8: Best Results from MAP Model Matching Correspondence Search

7.4.3 Sampling The Objective Function

Samples were taken along four straight lines passing through the peak in the
objective function resulting from the search in pose space reported above. (In the range
experiment, sampling was done through the known true pose.) The results are
illustrated in Figure 7-9. The peak in this data is not as sharp as the peak in the range
experiment reported in the previous section. This is likely due to the fact that the
features used in the video experiment are substantially less constraining than those
used in the range experiment - which have good range information in them.

Figure 7-9: Probes of Objective Function Peak

Sampling of the objective function with smoothing was also performed, as in
Section 7.3.2. Smoothing was performed at one scale. The smoothing matrix was

$$\mathrm{DIAG}\left((.03)^2, (.03)^2, (3.0)^2, (3.0)^2\right) .$$

Probing, performed in the same manner as in Figure 7-9, was performed on the
smoothed objective function. The results are shown in Figure 7-10. In comparison
to the range image experiment, local maxima are more of an issue here. This may be
partly due to the background features here having more structure than the randomly
generated background features used in the range image experiment. Because of this,
anomalous pose estimates (where the pose corresponding to the global maximum of
the objective function is seriously in error) may be more likely in this domain than
in the range experiment.

Figure 7-10: Probes of Smoothed Objective Function

7.5 Relation to Robust Estimation

This section describes a relationship between PMPE and robust estimation. By
simplifying the domain a robust estimator of position is obtained. A connection
between the simplified robust estimator and neural networks is discussed.
Consider the following simplifications of the domain:

• drop the pose prior

• the object has one feature

• the image is one-dimensional with width $W$

• the pose is a scalar

• the projection function translates: $P(\cdot, \beta) = \beta$

With these simplifications, the observation model of Equation 6.1 becomes



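$$p(Y_i \mid \Gamma_i, \beta) = \begin{cases} \dfrac{1}{W} & \text{if } \Gamma_i = \perp \\[1ex] G_\sigma(Y_i - \beta) & \text{otherwise} \end{cases}$$

where

$$G_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{x^2}{2\sigma^2} \right) .$$

In this simplified domain $\Gamma$ may be interpreted as a collection of variables that
describe the validity of their corresponding measurements in $Y$. Thus, $\Gamma_i \neq \perp$ may be
interpreted as meaning that $Y_i$ is valid, and $\Gamma_i = \perp$ as $Y_i$ being invalid. $p(Y_i)$ is defined
to be zero outside of the range $\left[ -\frac{W}{2}, \frac{W}{2} \right]$.

The prior on correspondences of Equation 2.2 takes the following form

$$p(\Gamma_i) = \begin{cases} B & \text{if } \Gamma_i = \perp \\ 1 - B & \text{otherwise.} \end{cases}$$

Using Bayes' rule and the independence of $\Gamma_i$ and $\beta$ allows the following probability
of a sample and its validity,

$$p(Y_i, \Gamma_i \mid \beta) = p(Y_i \mid \Gamma_i, \beta)\, p(\Gamma_i) = \begin{cases} \dfrac{B}{W} & \text{if } \Gamma_i = \perp \\[1ex] (1 - B)\, G_\sigma(Y_i - \beta) & \text{otherwise.} \end{cases} \tag{7.10}$$

The probability of a sample may now be expressed by taking a marginal over the
probability in Equation 7.10, as follows,

$$p(Y_i \mid \beta) = \sum_{\Gamma_i} p(Y_i, \Gamma_i \mid \beta) = \frac{B}{W} + (1 - B)\, G_\sigma(Y_i - \beta) .$$

Defining an objective function as a log likelihood of $\beta$

$$L(\beta) = \ln \left[ \prod_i p(Y_i \mid \beta) \right]$$

leads to the analog of the PMPE objective function for this simplified domain,

$$L(\beta) = \sum_i \ln \left[ \frac{B}{W} + (1 - B)\, G_\sigma(Y_i - \beta) \right] . \tag{7.11}$$

This may also be written

$$L(\beta) = \sum_i S(Y_i - \beta) \tag{7.12}$$

where

$$S(x) = \ln \left[ \frac{B}{W} + (1 - B)\, G_\sigma(x) \right] .$$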

This is the Maximum Likelihood objective function for estimating the mean of a
normal population of variance $\sigma^2$, that is contaminated with a uniform population of
width $W$, where the fraction of the mixture due to the uniform population is $B$.

The function $S(x)$ is approximately quadratic when the residual is small, and
approaches a constant when the residual is large. When $B$ goes to zero, $S(x)$ becomes
quadratic, and the estimator becomes least squares, for the case of a pure normal
population. When $-S(x)$ is viewed as a penalty function, it is seen to provide a
quadratic penalty for small residuals, as least squares does, but the penalty saturates
when residuals become large. Robust estimation is concerned with estimators that
are, like this one, less sensitive to outliers than least squares. As with many robust
estimators, the resulting optimization problem is more difficult than least squares,
since the objective function is non-convex. This estimator falls into the class of
redescending M-estimators as discussed by Huber [41].
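The saturating behavior of the penalty can be seen numerically. The sketch below is only an illustration: it maximizes the objective of Equation 7.12 by brute-force evaluation on a grid (a choice made here for simplicity, not a method from the text) and compares the result with the least-squares estimate, which for this problem is the sample mean. The samples and the parameter values are hypothetical.

```python
import math

def S(x, sigma, B, W):
    """S(x) = ln[B/W + (1 - B) G_sigma(x)] for the contaminated normal model."""
    g = math.exp(-x * x / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)
    return math.log(B / W + (1.0 - B) * g)

def L(beta, samples, sigma, B, W):
    """Objective of Equation 7.12: sum of S over the residuals."""
    return sum(S(y - beta, sigma, B, W) for y in samples)

# Five valid samples near 10 and three outliers, all inside an image of width 100.
samples = [9.8, 10.1, 10.0, 9.9, 10.2, 40.0, 47.0, 25.0]

# Exhaustive evaluation on a grid over [-50, 50] with spacing 0.01.
grid = [i * 0.01 for i in range(-5000, 5001)]
robust = max(grid, key=lambda b: L(b, samples, sigma=0.2, B=0.4, W=100.0))

# Least squares for a pure location model is the sample mean.
least_squares = sum(samples) / len(samples)

print(robust, least_squares)
```

The robust estimate stays with the cluster near 10, while the sample mean is pulled far toward the outliers. For a large residual, `S` flattens out at $\ln(B/W)$, which is the saturation described above.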

PMPE is somewhat different from robust estimation in that the saturating aspect
of the objective function not only decreases the influence of "outliers" (by analogy,
the background features), it also reduces the influence of image features that don't
correspond to (are not close to) a given object feature.

7.5.1 Connection to Neural Network Sigmoid Function

There is an important connection between the estimator of Equation 7.12 and the
sigmoid function of neural networks,

$$\sigma(x) = \frac{1}{1 + \exp(-x)} .$$

The sigmoid function is a smooth variant of a logical switching function that has
been used for modeling neurons. It has been used extensively by the neural network
community in the construction of networks that classify and exhibit some forms of
learning behavior. The NETtalk neural network of Sejnowski and Rosenberg [61] is
a well known example.

It turns out that, under some conditions on the parameters, the sigmoid function
of $x^2$ is approximately equal to $S(x)$, ignoring shifting and scaling. This near equality
is illustrated in Figure 7-11.

The two functions that are plotted in the figure are

$$f(x) = 2.0\left[ \sigma(x^2) - .5 \right] \quad \text{and} \quad g(x) = \frac{\ln\left[ .25 + .75 \exp(-x^2) \right]}{\ln(.25)} .$$

The upper graph shows $f(x)$ and $g(x)$ plotted together, while the lower graph shows
their difference. It can be seen that they agree to better than one percent.
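The size of the discrepancy between the two curves can be checked numerically. The sketch below assumes that $g$ is normalized by $\ln(.25)$, so that both functions run from 0 at $x = 0$ toward 1 for large $x$; the grid of sample points is an arbitrary choice made here.

```python
import math

def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def f(x):
    """Shifted and scaled sigmoid of x squared."""
    return 2.0 * (sigmoid(x * x) - 0.5)

def g(x):
    """Scaled log of the contaminated-normal mixture (normalization assumed)."""
    return math.log(0.25 + 0.75 * math.exp(-x * x)) / math.log(0.25)

# Sample both functions on [0, 5] and record the largest absolute difference.
xs = [i * 0.001 for i in range(0, 5001)]
max_diff = max(abs(f(x) - g(x)) for x in xs)
print(max_diff)
```

On this grid the largest difference stays below 0.01, consistent with the agreement to better than one percent stated above.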

Because of this near equality for a special case of the parameters, a network that
evaluates the ML estimator of Equation 7.12 for a contaminated normal population
will have the form illustrated in Figure 7-12.

This network, with its arrangement of sigmoid and sum units seems to fit the
definition of a neural network.

The robust estimator of Equation 7.12, and its neural network approximation, are
(approximately) optimal for locating a Gaussian cluster in uniform noise.

A similar neural network realization of the PMPE objective function would
likewise be near optimal for locating an object against a uniform background.

Figure 7-11: $f(x)$ and $g(x)$, and $f(x) - g(x)$

Figure 7-12: Network Implementation of MAP Estimator for Contaminated Normal

7.6 PMPE Efficiency Bound

This section provides a lower bound on the covariance matrix of the PMPE estimator.
Estimators of vector parameters (like pose) may be characterized by the covariance
matrix of the estimates they produce. The Cramer-Rao bound provides a lower
limit for the covariance matrix of unbiased estimators. Unbiased estimators that
achieve this bound are called efficient estimators. Discussions of estimator efficiency
and Cramer-Rao bounds appear in [63] and [72].

The Cramer-Rao bound on the covariance matrix of estimators of $\beta$ based on
observations of $X$ is given by the inverse of the Fisher information matrix,

$$\mathrm{COV}_X(\hat\beta) \geq I_F^{-1}(\beta) .$$

Here, $\mathrm{COV}_X(\cdot)$ denotes the covariance matrix of the random vector argument. This
matrix inequality means that $\mathrm{COV}_X(\hat\beta) - I_F^{-1}(\beta)$ is positive semi-definite.

The Fisher information matrix is defined as follows,

$$I_F(\beta) \equiv E_X\left( \left[ \nabla_\beta \ln p(X \mid \beta) \right] \left[ \nabla_\beta \ln p(X \mid \beta) \right]^T \right)$$

where $\nabla_\beta$ is the gradient with respect to $\beta$, which yields a column-vector, and $E_X(\cdot)$
stands for the expected value of the argument with respect to $p(X)$.

The covariance matrix, and the Cramer-Rao bound, of the PMPE estimator are
difficult to calculate. Instead, the Cramer-Rao bound and efficiency will be
determined for estimators that have access to both observed features $Y_i$, and the
correspondences $\Gamma_i$. The Cramer-Rao bound for these "complete-data" estimators will be
found, and it will be shown that there are no efficient complete-data estimators.
Because of this, the PMPE estimator is subject to the same bound as the complete-data
estimators, and the PMPE estimator cannot be efficient. This follows, because the
PMPE estimator can be considered to be technically a complete-data estimator that
ignores the correspondence data.

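In terms of the complete-data estimator, the Fisher information has the following
form

$$I_F(\beta) = E_{Y,\Gamma}\left( \left[ \nabla_\beta \ln p(Y, \Gamma \mid \beta) \right] \left[ \nabla_\beta \ln p(Y, \Gamma \mid \beta) \right]^T \right) . \tag{7.13}$$

Assuming independence of feature coordinates and of correspondences, the
probability of the complete-data is

$$p(Y, \Gamma \mid \beta) = \prod_i p(Y_i, \Gamma_i \mid \beta) .$$

Using Bayes' rule and the independence of $\Gamma$ and $\beta$,

$$p(Y_i, \Gamma_i \mid \beta) = p(Y_i \mid \Gamma_i, \beta)\, p(\Gamma_i) . \tag{7.14}$$

Referring to Equations 6.1 and 6.3, and using constant background probability $B$,
and linear projection, the complete-data component probability may be written as
follows,

$$p(Y_i, \Gamma_i \mid \beta) = \begin{cases} \dfrac{B}{W_1 W_2 \cdots W_v} & \text{if } \Gamma_i = \perp \\[1ex] \dfrac{1 - B}{m}\, G_{\psi_{ij}}(Y_i - M_j \beta) & \text{if } \Gamma_i = M_j . \end{cases}$$

Working towards an expression for the Fisher information, we differentiate the
complete-data probability to obtain

$$\nabla_\beta \ln p(Y, \Gamma \mid \beta) = \nabla_\beta \ln \prod_i p(Y_i, \Gamma_i \mid \beta) = \sum_i \frac{\nabla_\beta\, p(Y_i, \Gamma_i \mid \beta)}{p(Y_i, \Gamma_i \mid \beta)} .$$

When $\Gamma_i = \perp$, $\nabla_\beta\, p(Y_i, \Gamma_i \mid \beta) = 0$; otherwise, in the case $\Gamma_i = M_j$,

$$\nabla_\beta\, p(Y_i, \Gamma_i \mid \beta) = \nabla_\beta\, \frac{1 - B}{m}\, G_{\psi_{ij}}(Y_i - M_j \beta) .$$

Differentiating the normal density (a formula for this appears in 8.3), gives

$$\nabla_\beta\, p(Y_i, \Gamma_i \mid \beta) = \frac{1 - B}{m}\, G_{\psi_{ij}}(Y_i - M_j \beta)\, M_j^T \psi_{ij}^{-1} (Y_i - M_j \beta) ,$$

so that

$$\frac{\nabla_\beta\, p(Y_i, \Gamma_i \mid \beta)}{p(Y_i, \Gamma_i \mid \beta)} = M_j^T \psi_{ij}^{-1} (Y_i - M_j \beta) \quad \text{when } \Gamma_i = M_j .$$

Then the gradient of the complete-data probability may be expressed as

$$\nabla_\beta \ln p(Y, \Gamma \mid \beta) = \sum_{ij : \Gamma_i = M_j} M_j^T \psi_{ij}^{-1} (Y_i - M_j \beta) .$$

Note that setting this expression to zero defines the Maximum Likelihood estimator
for $\beta$ in the complete-data case, as follows:

$$\sum_{ij : \Gamma_i = M_j} M_j^T \psi_{ij}^{-1} M_j\, \hat\beta = \sum_{ij : \Gamma_i = M_j} M_j^T \psi_{ij}^{-1} Y_i ,$$

or

$$\hat\beta = \left( \sum_{ij : \Gamma_i = M_j} M_j^T \psi_{ij}^{-1} M_j \right)^{-1} \sum_{ij : \Gamma_i = M_j} M_j^T \psi_{ij}^{-1} Y_i . \tag{7.15}$$

This estimator is linear in $Y$. The inverse has been assumed to exist - it will exist,
provided certain linear independence conditions are met, and enough correspondences
to model features appear in the match. This typically requires two to four
correspondences in the applications described here.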

Returning to the Fisher information, we need to evaluate the expectation

$$I_F = E_{Y,\Gamma}\left( \left[ \sum_{ij : \Gamma_i = M_j} M_j^T \psi_{ij}^{-1} e_{ij} \right] \left[ \sum_{ij : \Gamma_i = M_j} M_j^T \psi_{ij}^{-1} e_{ij} \right]^T \right)$$

where the $ij$'th residual has been written as follows,

$$e_{ij} \equiv Y_i - M_j \beta .$$

Re-naming indices and pulling out the sums gives

$$I_F = E_{Y,\Gamma}\left( \sum_{ij : \Gamma_i = M_j}\; \sum_{i'j' : \Gamma_{i'} = M_{j'}} M_j^T \psi_{ij}^{-1}\, e_{ij}\, e_{i'j'}^T\, \psi_{i'j'}^{-1} M_{j'} \right) .$$

Referring to Equation 7.14, the expectation may be split and moved as follows,

$$I_F = E_\Gamma\left( \sum_{ij : \Gamma_i = M_j}\; \sum_{i'j' : \Gamma_{i'} = M_{j'}} M_j^T \psi_{ij}^{-1}\, E_{Y \mid \Gamma}\!\left( e_{ij}\, e_{i'j'}^T \right) \psi_{i'j'}^{-1} M_{j'} \right) .$$

The inner expectation is over mutually independent Gaussian random vectors, and
equals their covariance matrix when the indices agree, and is zero otherwise, so

$$I_F = E_\Gamma\left( \sum_{ij : \Gamma_i = M_j}\; \sum_{i'j' : \Gamma_{i'} = M_{j'}} M_j^T \psi_{ij}^{-1}\, \psi_{ij}\, \delta_{ii'}\, \delta_{jj'}\, \psi_{i'j'}^{-1} M_{j'} \right) .$$

This expression simplifies to the following:

$$I_F = E_\Gamma\left( \sum_{ij : \Gamma_i = M_j} M_j^T \psi_{ij}^{-1} M_j \right) .$$

The sum may be re-written in the following way by using a delta function comparing
$\Gamma_i$ and $M_j$,

$$I_F = E_\Gamma\left( \sum_{ij} \delta_{\Gamma_i M_j}\, M_j^T \psi_{ij}^{-1} M_j \right) = \sum_{ij} E_{\Gamma_i}\!\left( \delta_{\Gamma_i M_j} \right) M_j^T \psi_{ij}^{-1} M_j .$$

The expectation is just the probability that an image feature is matched to some
model feature. This is $\frac{1 - B}{m}$, so the Fisher information may be written in the following
simple form,

$$I_F = \frac{1 - B}{m} \sum_{ij} M_j^T \psi_{ij}^{-1} M_j ,$$

or as,

$$I_F = (1 - B)\, n\; \frac{1}{mn} \sum_{ij} M_j^T \psi_{ij}^{-1} M_j .$$

This is an attractive result, and may be easily interpreted, in relation to the Fisher
information for the pose when the correspondences are fixed (a standard linear
estimator). The Fisher information in that case is $\sum_{ij : \Gamma_i = M_j} M_j^T \psi_{ij}^{-1} M_j$; it may be interpreted
as the sum over matches of the per-match Fisher information.

In light of this, the complete-data Fisher information is seen to be the average
of the per-match Fisher information, multiplied by the expected number of features
matched to the model, $(1 - B)\, n$.
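As an illustrative special case (chosen here for concreteness, not worked in the text), consider the simplified one-dimensional domain of Section 7.5, where there is a single model feature ($m = 1$), the projection is a pure translation ($M_j = 1$), and every feature has variance $\sigma^2$ ($\psi_{ij} = \sigma^2$). Under those assumptions the complete-data Fisher information and the resulting Cramer-Rao bound reduce to scalars:

```latex
\begin{align*}
  I_F &= \frac{1 - B}{m} \sum_{ij} M_j^T \psi_{ij}^{-1} M_j
       = (1 - B) \sum_{i=1}^{n} \frac{1}{\sigma^2}
       = \frac{(1 - B)\, n}{\sigma^2} , \\
  \mathrm{var}(\hat\beta) &\geq I_F^{-1} = \frac{\sigma^2}{(1 - B)\, n} .
\end{align*}
```

This is the variance of the mean of $(1 - B)\, n$ valid samples, which matches the interpretation above: the per-match information $1/\sigma^2$ multiplied by the expected number of features matched to the model.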

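An efficient unbiased estimator for the complete-data exists if and only if

$$\hat\beta = \beta + I_F^{-1} \, \nabla_\beta \ln p(Y, \Gamma \mid \beta) .$$

This requires that the right hand side be independent of $\beta$, since the estimator
(Equation 7.15) is not a function of $\beta$. Expanding the right hand side,

$$\beta + \left[ \frac{1-B}{m} \sum_{ij} M_j^T \psi_{ij}^{-1} M_j \right]^{-1} \sum_{ij:\Gamma_i=M_j} M_j^T \psi_{ij}^{-1} \left( Y_i - M_j \beta \right) .$$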



This is not independent of $\beta$. One way to see this is to note that the factor multiplying
$\beta$ in the second term is a function of $\Gamma$. Thus, no efficient estimator exists in the
complete-data case, and consequently no efficient estimator exists for PMPE.

7.7 Related Work



Green [31] and Green and Shapiro [32] describe a theory of Maximum Likelihood
laser radar range profiling. The research focuses on statistically optimal detectors
and recognizers. The single pixel statistics are described by a mixture of uniform
and normal components. Range profiling is implemented using the EM algorithm.
Under some circumstances, least squares provides an adequate starting value. A
continuation-style variant is described, where a range accuracy parameter is varied
between EM convergences from a coarse value to its true value. Green [31] computes
Cramer-Rao bounds for the complete-data case of the Maximum Likelihood range profile
estimator, and compares simulated and real-data performance to the limits.

Cass [16] [15] describes an approach to visual object recognition that searches
in pose space for maximal alignments under the bounded error model. The pose-
space objective function used there is piecewise constant, and is thus not amenable
to continuous search methods. The search is based on a geometric formulation of the
constraints on feasible transformations.

There are some connections between PMPE and standard methods of robust pose
estimation, like those described by Haralick [38], and Kumar and Hanson [48]. Both
can provide robust estimates of the pose of an object, based on an observed image.
The main difference is that the standard methods require specification of the feature
correspondences, while PMPE does not, since it considers all possible correspondences.
PMPE requires a starting value for the pose (as do standard methods of robust pose
estimation that use non-convex objective functions).

As mentioned above, Yuille, Geiger and Bülthoff [78] discussed computing
disparities in a statistical theory of stereo where a marginal is computed over matches.
Yuille extends this technique [79] to other domains of vision and neural networks,
among them winner-take-all networks, stereo, long-range motion, the traveling
salesman problem, deformable template matching, learning, content addressable
memories, and models of brain development. In addition to computing marginals over
discrete fields, the Gibbs probability distribution is used. This facilitates continuation-
style optimization methods by variation of the temperature parameter. There are
some similarities between this approach and using coarse-fine with the PMPE
objective function.

Edelman and Poggio [24] describe a method of 3D recognition that uses a trained
Generalized Radial Basis Function network. Their method requires correspondences
to be known during training and recognition. One similarity between their scheme
and PMPE is that both are essentially arrangements of smooth unimodal functions.

There is a similarity between Posterior Marginal Pose Estimation and Hough
transform (HT) methods. Roughly speaking, HT methods evaluate parameters by
accumulating votes in a discrete parameter space, based on observed features. (See
the survey paper by Illingworth and Kittler [44] for a discussion of Hough methods.)

In a recognition application, as described here, the HT method would evaluate a
discrete pose by counting the number of feature pairings that are exactly consistent
somewhere within the cell of pose space. As stated, the HT method has difficulties
with noisy features. This is usually addressed by counting feature pairings that are
exactly consistent somewhere nearby the cell in pose space.

The utility of the HT as a standalone method for recognition in the presence of
noise is a topic of some controversy. This is discussed by Grimson [34], pp. 220.
Perhaps this is due to an unsuitable noise model implicit in the Hough Transform.

Posterior Marginal Pose Estimation evaluates a pose by accumulating the
logarithm of posterior marginal probability of the pose over image features.

The connection between HT methods and parameter evaluation via the logarithm
of posterior probability has been described by Stephens [67]. Stephens proposes to call
the posterior probability of parameters given image observations "The Probabilistic
Hough Transform". He provided an example of estimating line parameters from
image point features whose probability densities were described as having uniform
and normal components. He also states that the method has been used to track 3D
objects, referring to his thesis [68] for definition of the method used.

7.8 Summary

A method of evaluating poses in object recognition, Posterior Marginal Pose
Estimation, has been described. The resulting objective function was seen to have a simple
form when normal feature deviation models and linear projection models are used.

Limited experimental results were shown indicating that in a domain of synthetic
range discontinuity features, the objective function may have a prominent sharp peak
near the correct pose. Some local maxima were also apparent. Another experiment,
in which the features were derived from video images, was described. Connections to
robust estimation and neural networks were examined. Bounds on the performance of
simplified PMPE estimators were indicated, and relation to other work was discussed.



Chapter 8 

Expectation-Maximization
Algorithm



The Expectation-Maximization (EM) algorithm was introduced in its general form
by Dempster, Rubin and Laird in 1978 [21]. It is often useful for computing estimates
in domains having two sample spaces, where the events in one are unions over events
in the other. This situation holds among the sample spaces of Posterior Marginal
Pose Estimation (PMPE) and MAP Model Matching. In the original paper, the wide
generality of the EM algorithm is discussed, along with several previous appearances
in special cases, and convergence results are described.

In this chapter, a specific form of the EM algorithm is described for use with
PMPE. It is used for hypothesis refinement in the recognition experiments that are
described in Chapter 10. Issues of convergence and implementation are discussed.



8.1 Definition of EM Iteration

In this section a variant of the EM algorithm is presented for use with Posterior
Marginal Pose Estimation, which was described in Chapter 7. The following modeling
assumptions were used. Normal models are used for matched image features, while
uniform models are used for unmatched (background) features. If a prior on the pose is
available, it is normal. The independent correspondence model is used. Additionally,
a linear model is used for feature projection.

In PMPE, the pose of an object, $\beta$, is estimated by maximizing its posterior
probability, given an image.

$$\hat\beta = \arg\max_\beta \; p(\beta \mid Y) .$$

A necessary condition for the maximum is that the gradient of the posterior
probability with respect to the pose be zero, or equivalently that the gradient of the
logarithm of the posterior probability be zero:

$$0 = \nabla_\beta \ln p(\hat\beta \mid Y) . \qquad (8.1)$$

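In Section 7.1, Equation 7.2, the following formula was given for the posterior
probability of the pose of an object, given an image. This assumes use of the independent
correspondence model.

$$p(\beta \mid Y) = \frac{p(\beta)}{p(Y)} \prod_i p(Y_i \mid \beta) .$$

Imposing the condition of Equation 8.1 yields the following,

$$0 = \nabla_\beta \left[ \ln \frac{1}{p(Y)} + \ln p(\hat\beta) + \sum_i \ln p(Y_i \mid \hat\beta) \right]$$

or

$$0 = \frac{\nabla_\beta \, p(\hat\beta)}{p(\hat\beta)} + \sum_i \frac{\nabla_\beta \, p(Y_i \mid \hat\beta)}{p(Y_i \mid \hat\beta)} . \qquad (8.2)$$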

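As in Equation 7.3, we may write the feature PDF conditioned on pose in the
following way,

$$p(Y_i \mid \beta) = \sum_{\Gamma_i} p(Y_i \mid \Gamma_i, \beta) \, p(\Gamma_i) ,$$

or, using the specific models assumed in Section 7.1, as reflected in Equation 7.4, and
using a linear projection model,

$$p(Y_i \mid \beta) = \frac{B_i}{W_1 W_2 \cdots W_v} + \frac{1 - B_i}{m} \sum_j G_{\psi_{ij}}(Y_i - M_j \beta) .$$

The zero gradient condition of Equation 8.2 may now be expressed as follows,

$$0 = \frac{\nabla_\beta \, p(\hat\beta)}{p(\hat\beta)} + \sum_i \frac{\frac{1-B_i}{m} \sum_j \nabla_\beta G_{\psi_{ij}}(Y_i - M_j \hat\beta)}{\frac{B_i}{W_1 W_2 \cdots W_v} + \frac{1-B_i}{m} \sum_j G_{\psi_{ij}}(Y_i - M_j \hat\beta)} .$$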

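With a normal pose prior,

$$p(\beta) = G_{\psi_\beta}(\beta - \beta_0) , \quad \text{and} \quad \nabla_\beta \, p(\beta) = -p(\beta) \, \psi_\beta^{-1} (\beta - \beta_0) .$$

The gradient of the other normal density is

$$\nabla_\beta G_{\psi_{ij}}(Y_i - M_j \beta) = -G_{\psi_{ij}}(Y_i - M_j \beta) \, M_j^T \psi_{ij}^{-1} (M_j \beta - Y_i) . \qquad (8.3)$$

Returning to the gradient condition, and using these expressions (negated),

$$0 = \psi_\beta^{-1} (\hat\beta - \beta_0) + \sum_i \frac{\sum_j G_{\psi_{ij}}(Y_i - M_j \hat\beta) \, M_j^T \psi_{ij}^{-1} (M_j \hat\beta - Y_i)}{\frac{B_i}{1-B_i} \frac{m}{W_1 W_2 \cdots W_v} + \sum_j G_{\psi_{ij}}(Y_i - M_j \hat\beta)} .$$

Finally, the zero gradient condition may be expressed compactly as follows,

$$0 = \psi_\beta^{-1} (\hat\beta - \beta_0) + \sum_{ij} W_{ij} \, M_j^T \psi_{ij}^{-1} (M_j \hat\beta - Y_i) , \qquad (8.4)$$

with the following definition:

$$W_{ij} = \frac{G_{\psi_{ij}}(Y_i - M_j \hat\beta)}{\frac{B_i}{1-B_i} \frac{m}{W_1 W_2 \cdots W_v} + \sum_j G_{\psi_{ij}}(Y_i - M_j \hat\beta)} . \qquad (8.5)$$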

Equation 8.4 has the appearance of being a linear equation for the pose estimate $\hat\beta$
that satisfies the zero gradient condition for being a maximum. Unfortunately, it isn't
a linear equation, because the $W_{ij}$ (the "weights") are not constants; they are functions
of $\hat\beta$. To find solutions to Equation 8.4, the EM algorithm iterates the following two
steps:

• Treating the weights, $W_{ij}$, as constants, solve Equation 8.4 as a linear equation
for a new pose estimate $\hat\beta$. This is referred to as the M step.

• Using the most recent pose estimate $\hat\beta$, re-evaluate the weights, $W_{ij}$, according
to Equation 8.5. This is referred to as the E step.
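
The two steps above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions that are not in the text: features are scalars, the pose is beta = (a, b) with linear projection a*x + b, every match shares one variance `var`, the background probability `bg_prob` and the image extent `extent` are single constants, and the pose prior is flat so its term drops out of the M step. All function names and numbers are hypothetical.

```python
import math

def gaussian(residual, var):
    """1-D normal density with variance var, evaluated at a residual."""
    return math.exp(-0.5 * residual * residual / var) / math.sqrt(2.0 * math.pi * var)

def e_step(beta, image, model, var, bg_prob, extent):
    """Re-evaluate the weights W[i][j] from the current pose (scalar form of Eq. 8.5)."""
    a, b = beta
    bg_term = (bg_prob / (1.0 - bg_prob)) * (len(model) / extent)
    weights = []
    for y in image:
        g = [gaussian(y - (a * x + b), var) for x in model]
        denom = bg_term + sum(g)
        weights.append([gj / denom for gj in g])
    return weights

def m_step(weights, image, model):
    """Hold the weights fixed and solve the weighted least squares problem for (a, b)."""
    sxx = sx = s1 = sxy = sy = 0.0
    for i, y in enumerate(image):
        for j, x in enumerate(model):
            w = weights[i][j]
            sxx += w * x * x
            sx += w * x
            s1 += w
            sxy += w * x * y
            sy += w * y
    det = sxx * s1 - sx * sx  # determinant of the 2x2 normal equations
    return ((sxy * s1 - sx * sy) / det, (sxx * sy - sx * sxy) / det)

def refine_pose(beta, image, model, var=0.04, bg_prob=0.3, extent=10.0, iterations=30):
    """Alternate E and M steps, starting on the E step from an initial pose."""
    for _ in range(iterations):
        weights = e_step(beta, image, model, var, bg_prob, extent)
        beta = m_step(weights, image, model)
    return beta, e_step(beta, image, model, var, bg_prob, extent)

# Toy data: the model seen at pose (1.0, 0.5), plus two clutter features.
model = [0.0, 1.0, 2.0, 3.0, 4.0]
image = [0.5, 1.5, 2.5, 3.5, 4.5, 7.3, 9.1]
pose, weights = refine_pose((1.0, 0.3), image, model)
```

In this toy run the pose is started slightly off in translation. After the iteration the rows of `weights` for the two clutter features are essentially zero, while each matched image feature puts most of its weight on one model feature.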

The M step is so named because, in the exposition of the algorithm in [21], it
corresponds to a Maximum Likelihood estimate. As discussed there, the algorithm
is also amenable to use in MAP formulations (like this one). Here the M step
corresponds to a MAP estimate of the pose, given that the current estimate of the weights
is correct.

The E step is so named because calculating the $W_{ij}$ corresponds to taking the
expectation of some random variables, given the image data, and that the most recent
pose estimate is correct. These random variables have value 1 if the $i$'th image feature
corresponds to the $j$'th object feature, and 0 otherwise. Thus, after the iteration
converges, the weights provide continuous-valued estimates of the correspondences,
that vary between 0 and 1.

It seems somewhat ironic that, having abandoned the correspondences as being
part of the hypothesis in the formulation of PMPE, a good estimate of them has
re-appeared as a byproduct of a method for search in pose space. This estimate, the
posterior expectation, is the minimum variance estimator.

Being essentially a local method of non-linear optimization, the EM algorithm
needs good starting values in order to converge to the right local maximum. It may
be started on either step. If it is started on the E step, an initial pose estimate is
required. When started on the M step, an initial set of weights is needed.

An initial set of weights can be obtained from a partial hypothesis of
correspondences in a simple manner. The weights associated with each set of corresponding
features in the hypothesis are set to 1, the rest to 0. Indexing methods are one source
of such hypotheses. In Chapter 10, Angle Pair Indexing is used to generate candidate
hypotheses. In this scenario, indexing provides initial alignments, these are refined
using the EM algorithm, then they are verified by examining the value of the peak of
the PMPE objective function that the refinement step found.
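
Starting on the M step from a partial hypothesis can be sketched as follows. The function name and the list-of-lists layout of the weight matrix are illustrative assumptions, not the author's implementation.

```python
def initial_weights(num_image, num_model, correspondences):
    """Weight matrix with 1 for each hypothesized (image, model) pair and 0 elsewhere."""
    weights = [[0.0] * num_model for _ in range(num_image)]
    for i, j in correspondences:
        weights[i][j] = 1.0
    return weights

# A hypothesis pairing image feature 0 with model feature 2,
# and image feature 3 with model feature 1.
w0 = initial_weights(4, 3, [(0, 2), (3, 1)])
```

Such a matrix can be handed directly to an M step, which then produces the rough alignment pose implied by the hypothesized pairs.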

8.2 Convergence

In the original reference [21], the EM algorithm was shown to have good convergence
properties under fairly general circumstances. It is shown that the likelihood sequence
produced by the algorithm is monotonic, i.e., the algorithm never reduces the value
of the objective function (or in this case, the posterior probability) from one step to
the next. Wu [77] claims that the convergence proof in the original EM reference is
flawed, and provides another proof, as well as a thorough discussion. It is possible
that the algorithm will wander along a ridge, or become stuck in a saddle point.

In the recognition experiments reported in Chapter 10 the algorithm typically
converges in 10 - 40 iterations.

8.3 Implementation Issues

Some thresholding methods were used to speed up the computation of the E and M
steps.

The weights $W_{ij}$ provide a measure of feature correspondence. As the algorithm
operates, most of the weights have values close to zero, since most pairs of image and
object features don't align well for a given pose. In the computation of the M step,
most terms were left out of the sum based on a threshold for $W_{ij}$. Some representative
weights from an experiment are displayed in Table 10.1 in Chapter 10.

In the E step, most of the work is in evaluating the Gaussian functions, which have
quadratic forms in them. For the reason stated above, most of these expressions have
values very close to zero. The evaluation of these expressions was made conditional
on a threshold test applied to the residuals $Y_i - M_j \beta$. When the (x,y) part of the
residual exceeded a certain length, zero was substituted for the value of the Gaussian
expression. Tables indexed by image coordinates might provide another effective way
of implementing the thresholding here.
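
The residual test described above might look like the following sketch. It assumes an isotropic covariance `var * I` and a hypothetical cutoff `max_length`; the text does not give the threshold value or the covariance structure used in the experiments.

```python
import math

def gated_gaussian(residual, var, max_length):
    """Isotropic 2-D normal density of a residual, or 0.0 when the residual is too long."""
    rx, ry = residual
    sq_len = rx * rx + ry * ry
    if sq_len > max_length * max_length:
        return 0.0  # skip the exponential for pairs that clearly do not align
    return math.exp(-0.5 * sq_len / var) / (2.0 * math.pi * var)
```

Comparing squared lengths avoids a square root in the common case, which is where most image and model feature pairs fall.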

The value of the PMPE objective function is computed as a byproduct of the E
step for little additional cost.

8.4 Related Work

The work of Green [31] and Green and Shapiro [32] that is discussed in Section 7.7
describes use of the EM algorithm in a theory of laser radar range profiling.

Lipson [50] describes a non-statistical method for refining alignments that iterates
solving linear systems. It matches model features to the nearest image feature under
the current pose hypothesis, while the method described here entertains matches to
all of the image features, weighted by their probability. Lipson's method was shown
to be effective and robust in an implementation that refines alignments under Linear
Combination of Views.



Chapter 9 



Angle Pair Indexing



9.1 Description of Method

Angle Pair Indexing is a simple method that is designed to reduce the amount of
search needed in finding matches for image features in 2D recognition. It uses features
having location and orientation.

An invariant property of feature pairs is used to index a table that is constructed
ahead of time. The property used is the pair of angles between the feature orientations
and the line joining the features' locations. These angles are $\theta_1$ and $\theta_2$ in Figure 9-1.
The pair of angles is clearly invariant under translation, rotation, and scaling in the
plane.
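
The invariance can be checked numerically with a small sketch. Features are assumed here to be `(x, y, orientation)` tuples, and the angles are reduced modulo pi; both choices, like the helper names, are illustrative assumptions.

```python
import math

def angle_pair(f1, f2):
    """Angles between each feature's orientation and the line joining the two locations."""
    x1, y1, o1 = f1
    x2, y2, o2 = f2
    line = math.atan2(y2 - y1, x2 - x1)
    return ((o1 - line) % math.pi, (o2 - line) % math.pi)

def moved(f, scale, rot, tx, ty):
    """Apply a 2-D similarity transform (scale, rotation, translation) to a feature."""
    x, y, o = f
    c, s = math.cos(rot), math.sin(rot)
    return (scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty, o + rot)

f1, f2 = (0.0, 0.0, 0.3), (2.0, 1.0, 1.2)
before = angle_pair(f1, f2)
after = angle_pair(moved(f1, 2.5, 0.7, 10.0, -4.0), moved(f2, 2.5, 0.7, 10.0, -4.0))
```

Up to floating point error, `before` and `after` agree: the rotation shifts both the orientations and the joining line by the same angle, while scaling and translation leave the direction of the joining line unchanged.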

Using orientations as well as point locations provides more constraint than point
features. Because of this, indexing may be performed on pairs of simple features,
rather than groups of three or more.

The table is constructed from the object features in a pre-processing step. It is
indexed by the angle pair, and stores the pairs of object features that are consistent
with the value of the angles, within the resolution of the table. The algorithm for
constructing the table appears below.

A distance threshold is used to suppress entries for features that are very close.
Such feature pairs yield sloppy initial pose estimates and are poor initial hypotheses
for recognition.

Figure 9-1: Angles for Indexing



;;; Given an array model-features and a table size, n,
;;; fills in the 2-index array ANGLE-PAIR-TABLE by side-effect.
BUILD-ANGLE-TABLE(model-features, n, distance-threshold)
    m ← LENGTH(model-features)
    ;; First clear the table (the table is n by n).
    For i ← 0 To n
        For j ← 0 To n
            ANGLE-PAIR-TABLE[i, j] ← ∅
    ;; Now fill in the table entries.
    For i ← 0 To m
        For j ← 0 To m
            If i ≠ j
                f1 ← model-features[i]
                f2 ← model-features[j]
                If DISTANCE(f1, f2) > distance-threshold
                    <q, r> ← CALCULATE-INDICES(f1, f2, n)
                    ANGLE-PAIR-TABLE[q, r] ← ANGLE-PAIR-TABLE[q, r] ∪ <i, j>

The following function is used to calculate the table indices for a pair of features.
Note that the indexing wraps around when the angles are increased by π. This
was done because the features used in the recognition experiments described in this
research are often straight edge segments, and their orientations are ambiguous by π.

;;; Calculate indices into ANGLE-PAIR-TABLE for a pair of features.
;;; θ1 and θ2 are the angles of Figure 9-1 for the pair (f1, f2).
CALCULATE-INDICES(f1, f2, n)
    δθ ← π / n
    i ← ⌊θ1 / δθ⌋ mod n
    j ← ⌊θ2 / δθ⌋ mod n
    Return(<i, j>)
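
A Python rendering of the index calculation makes the wrap-around by π easy to check. As a simplification, this sketch takes the two angles directly rather than a pair of features; the function name follows the pseudocode, but the signature is an assumption.

```python
import math

def calculate_indices(theta1, theta2, n):
    """Map an angle pair to cell indices of an n-by-n table, wrapping with period pi."""
    dtheta = math.pi / n
    i = math.floor(theta1 / dtheta) % n
    j = math.floor(theta2 / dtheta) % n
    return (i, j)

cell = calculate_indices(0.1, 1.0, 80)
wrapped = calculate_indices(0.1 + math.pi, 1.0 - math.pi, 80)
```

Here `cell` and `wrapped` name the same table cell, because adding or subtracting π moves an angle by exactly n cells, which the `mod n` removes.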

The following algorithm is used at recognition-time to generate a set of pairs of
correspondences from image features to object features that have consistent values of
the angle pair invariant. The indexing operation saves the expense of searching for
pairs of object model features that are consistent with pairs of image features. Table
entries from adjacent cells are included among the candidates to accommodate angle
values that are "on the edge" of a cell boundary.

;;; Map over the pairs of features in an image and generate
;;; candidate pairs of feature correspondences.
GENERATE-CANDIDATES(image-features, n)
    candidates ← ∅
    m ← LENGTH(image-features)
    For i ← 0 To m
        For j ← i + 1 To m
            <q, r> ← CALCULATE-INDICES(image-features[i], image-features[j], n)
            For δq ← −1 To 1
                For δr ← −1 To 1
                    For <k, l> ∈ ANGLE-PAIR-TABLE[((q + δq) mod n), ((r + δr) mod n)]
                        candidates ← candidates ∪ <<i, k>, <j, l>>
    Return(candidates)
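
The table construction and candidate generation can be put together in one runnable sketch. It is an illustration under stated assumptions rather than the implementation used in the experiments: features are `(x, y, orientation)` tuples, the table is a dictionary keyed by cell instead of a 2-index array, and the helper `moved` and all numeric values are hypothetical.

```python
import math
from collections import defaultdict

def angle_pair(f1, f2):
    """Angles (mod pi) between each orientation and the line joining the locations."""
    line = math.atan2(f2[1] - f1[1], f2[0] - f1[0])
    return ((f1[2] - line) % math.pi, (f2[2] - line) % math.pi)

def calculate_indices(f1, f2, n):
    """Table cell for a pair of features."""
    dtheta = math.pi / n
    t1, t2 = angle_pair(f1, f2)
    return (int(t1 // dtheta) % n, int(t2 // dtheta) % n)

def build_angle_table(model_features, n, distance_threshold):
    """Store every ordered pair of distinct, well separated model features by cell."""
    table = defaultdict(list)
    for i, f1 in enumerate(model_features):
        for j, f2 in enumerate(model_features):
            if i != j and math.dist(f1[:2], f2[:2]) > distance_threshold:
                table[calculate_indices(f1, f2, n)].append((i, j))
    return table

def generate_candidates(image_features, table, n):
    """Look up each image feature pair, including the eight adjacent cells."""
    candidates = set()
    m = len(image_features)
    for i in range(m):
        for j in range(i + 1, m):
            q, r = calculate_indices(image_features[i], image_features[j], n)
            for dq in (-1, 0, 1):
                for dr in (-1, 0, 1):
                    for k, l in table.get(((q + dq) % n, (r + dr) % n), ()):
                        candidates.add(((i, k), (j, l)))
    return candidates

def moved(f, scale, rot, tx, ty):
    """Apply a 2-D similarity transform to a feature (used only to make test data)."""
    c, s = math.cos(rot), math.sin(rot)
    return (scale * (c * f[0] - s * f[1]) + tx,
            scale * (s * f[0] + c * f[1]) + ty, f[2] + rot)

model = [(0.0, 0.0, 0.3), (40.0, 10.0, 1.2), (10.0, 50.0, 2.0)]
table = build_angle_table(model, 16, 5.0)
image = [moved(f, 1.5, 0.4, 20.0, -7.0) for f in model]
candidates = generate_candidates(image, table, 16)
```

Because the image here is a scaled, rotated and translated copy of the model, the correct correspondence pairs such as `((0, 0), (1, 1))` appear among the candidates. The model pairs are stored in both orders, so visiting each unordered image pair once is sufficient.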



9.2 Sparsification

In the recognition experiments described below and in Section 10.1, an additional
technique was used to speed up recognition-time processing, and reduce the size of
the table. As the table was built, a substantial fraction of the entries were left out
of the table. These entries were selected at random. This strategy is based on the
following observation. For the purpose of recognizing the object, it is only necessary
for some feature pair from the object to be both in the table and visible in the image. If
a reasonable fraction of the object is visible, a substantial number of feature pairs will
be available as potential partners in a candidate correspondence pair. It is unlikely
that the corresponding pairs of object model features will all have been randomly
eliminated when the table was built, even for fairly large amounts of sparsification.
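
The observation can be made quantitative under a simple assumption that is not stated in the text: each table entry is kept independently with probability `keep_fraction`. The chance that at least one of the visible feature pairs survives in the table is then one minus the chance that all of them were dropped.

```python
def prob_some_pair_survives(keep_fraction, visible_pairs):
    """Probability that at least one of the visible feature pairs was kept in the table."""
    return 1.0 - (1.0 - keep_fraction) ** visible_pairs

# Hypothetical numbers: keep 5 percent of the entries, with 100 visible feature pairs.
p = prob_some_pair_survives(0.05, 100)
```

With these illustrative numbers `p` exceeds 0.99, so even a table that keeps only a small fraction of its entries is very likely to retain some usable pair.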

9.3 Related Work

Indexing based on invariant properties of sets of image features has been used by
Lamdan and Wolfson, in their work on geometric hashing [49], and by Clemens and
Jacobs [19] [20], Jacobs [45], and Thompson and Mundy [70]. In those cases the
invariance is with respect to affine transformations that have eight parameters. In
this work the invariance is with respect to translation, rotation, and scale in 2D,
where there are four parameters. Thompson and Mundy describe an invariant called
vertex pairs. These are based on angles relating to pairs of vertices of 3D polyhedra,
and their projections into 2D. Angle Pair Indexing is somewhat similar, but is simpler,
being designed for 2D from 2D recognition.

Clemens and Jacobs [19] [20], and Jacobs [45] use grouping mechanisms to select
small sets of image features that are likely to belong to the same object in the scene.

Chapter 10



Recognition Experiments



This chapter describes several recognition experiments that use Posterior Marginal
Pose Estimation with the EM Algorithm. The first is a complete 2D recognition
system that uses Angle Pair Indexing as the first stage. In another experiment, the
PMPE objective function is evaluated on numerous random alignments.
Additionally, the effect of occlusions on PMPE is investigated. Finally, refinement of 3D
alignments is demonstrated.

In the following experiments, image edge curves were arbitrarily subdivided into
fragments for feature extraction. The recognition experiments based on these features
show good performance, but the performance might be improved if a more stable
subdivision technique were used.



10.1 2D Recognition Experiments

The experiments described in this section use the EM algorithm to carry out local
searches in pose space of the PMPE objective function. This is used for evaluating
and refining alignments that are generated by Angle Pair Indexing. A coarse - fine
approach is used in refining the alignments produced by Angle Pair Indexing. To this
end, two sets of features are used, coarse features and fine features.

Figure 10-1: Grayscale Image

Figure 10-2: Coarse Model and Image Features

Figure 10-3: Fine Model and Image Features

The video image used for the recognition experiment appears in Figure 10-1. The
model features were derived from Mean Edge Images, as described in Section 4.4.
The standard deviation of the smoothing that was used in preparing the model and
image edge maps was 3.97 for the coarse features, and 1.93 for the fine features. The
edge curves were broken arbitrarily every 20 pixels for the coarse features, and every
10 pixels for the fine features. Point-radius features were fitted to the edge curve
fragments, as described in Section 5.3. The coarse model and image features appear
in Figure 10-2, the fine model and image features appear in Figure 10-3. There are 81
coarse model features, 334 coarse image features, 246 fine model features, and 1063
fine image features.

The oriented stationary statistics model of feature fluctuations was used (this
is described in Section 3.3). The parameters (statistics) that appear in the PMPE
objective function, the background probability and the covariance matrix for the
oriented stationary statistics, were derived from matches that were done by hand.
These training matches were also used in the empirical study of the goodness of
the normal model for feature fluctuations discussed in Section 3.2.1, and they are
described there.

10.1.1 Generating Alignments

Initial alignments were generated using Angle Pair Indexing (described in Chapter 9)
on the coarse features. The angle pair table was constructed with 80 by 80 cells, and
sparsification was used: 5 percent of the entries were randomly kept. The distance
threshold was set at 50 pixels (the image size is 640 by 480). The resulting table
contained 234 entries. With these values, uniformly generated random angle pairs
have .0365 probability of "hitting" in the table.

When the image feature pairs were indexed into the table, 20574 candidate feature
correspondence pairs were generated. This is considerably fewer than the 732 million
possible pairs of correspondences in this situation. Figure 10-4 illustrates three of
the candidate alignments by superimposing the object in the images at the pose
associated with the initial alignment implied by the pairs of feature correspondences.
The indicated scores are the negative of the PMPE objective function computed with
the coarse features.
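
The figures quoted in this subsection can be reproduced with simple arithmetic. The text does not say how they were computed, so the readings below are assumptions that happen to match: the hit probability is taken as table entries divided by table cells, and the number of possible pairs of correspondences as the square of the number of single image-to-model correspondences.

```python
table_cells = 80 * 80
table_entries = 234
hit_probability = table_entries / table_cells        # about 0.0366

coarse_image_features = 334
coarse_model_features = 81
single_correspondences = coarse_image_features * coarse_model_features
possible_pairs = single_correspondences ** 2          # 731,918,916, about 732 million

generated_candidates = 20574
reduction_factor = possible_pairs / generated_candidates
```

Under this reading, indexing reduces the number of hypotheses to examine by a factor of more than thirty thousand.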

10.1.2 Scoring Indexer Alignments

The initial alignments were evaluated in the following way. The indexing process
produces hypotheses consisting of a pair of correspondences from image features to
object features. These pairs of correspondences were converted into an initial weight
matrix for the EM algorithm. The M step of the algorithm was run, producing a
rough alignment pose. The pose was then evaluated using the E step of the EM
algorithm, which computes the value of the objective function as a side effect (in
addition to a new estimate of the weights). Thus, running one cycle of the EM
algorithm, initialized by the pair of correspondences, generates a rough alignment,
and evaluates the PMPE objective function for that alignment.

10.1.3 Refining Indexer Alignments

This section illustrates the method used to refine indexer alignments.

Figure 10-5 shows a closer view of the best scoring initial alignment from Angle
Pair Indexing. The initial alignment was refined by running the EM algorithm to
convergence using the coarse features and statistics. The result of this coarse refinement
is displayed in Figure 10-6. The coarse refinement was refined further by running the
EM algorithm to convergence with the fine features and statistics. The result of this
fine refinement is shown in Figure 10-7, and over the video image in Figure 10-8.

Figure 10-4: Poses and Scores of Some Indexed Hypotheses

Figure 10-5: Best Alignment from Indexer, with Coarse Score

Figure 10-6: Coarse Refinement, with Coarse Score

Figure 10-7: Fine Refinement, with Fine Score

Figure 10-8: Fine Refinement with Video Image

Figure 10-9: Correspondences with Weight Larger than .5

Ground truth for the pose is available in this experiment, as the true pose is the
null pose. The pose before refinement is

[.99595, -0.0084747, -0.37902, 5.0048] T ,

and after the refinement it is

[ 1.00166, 0.0051108, 0.68621, ^.7817] T . 

The encoding of these poses is described in Section 5.3 (the null pose is [1, 0, 0, 0]).
The initial pose is in error by about .01 in scale and 5 pixels in position. The final
pose errs by about .005 in scale and 1.8 pixels in position. Thus scale accuracy is
improved by a factor of about two, and position accuracy is improved by a factor of
about three. An experiment showing more dramatic improvement is described below,
in Section 10.4.1.

In these experiments, fewer than 15 iterations of the EM algorithm were needed for
convergence.



10.1.4 Final EM Weights

As discussed in Section 8.1, a nice aspect of using the EM algorithm with PMPE is
that estimates of feature correspondences are available in the weight matrix. Figure
10-9 displays the correspondences that have weight greater than .5, for the final
convergence shown in Figure 10-7. Here, the image and model features are displayed
as thin curves, and the correspondences between them are shown as heavy lines
joining the features. Note the strong similarity between these correspondences, and
those that the system was trained on, shown in Figure 3-2.

Table 10.1 displays the values of some of the weights. The weights shown have
value greater than .01. There are 299 weights this large among the 413,507 weights.
The 39 weights shown are those belonging to 20 image features.



10.2 Evaluating Random Alignments

An experiment was performed to test the utility of PMPE in evaluating randomly
generated alignments. Correspondences among the coarse features described in
Section 10.1 having assignments from two image features to two model features were
randomly generated, and evaluated as in Section 10.1.2. 19118 random alignments
were generated, of which 1200 had coarse scores better than -30.0 (the negative of
the PMPE objective function). Among these 1200, one was essentially correct. The
first, second, third, fourth, fifth, and fifteenth best scoring alignments are shown in
Figure 10-10.

With coarse - fine refinement, the best scoring random alignment converged to
the same pose as the best refinement from the indexing experiment, shown in Figure
10-7, with fine score -355.069. The next best scoring random alignment converged to
a grossly wrong pose, with fine score -149.064. This score provides some indication
of the noise level in the fine PMPE objective function in pose space.

This test, though not exhaustive, produced no false positives, in the sense of a bad
alignment with a coarse score better than that of the correct alignment. Additionally,
the fine score of the refinement of the most promising incorrect random alignment
was significantly lower than the fine score of the (correct) refined best alignment.

Figure 10-10: Random Alignments

10.3 Convergence with Occlusion

The convergence behavior under occlusion of the EM algorithm with PMPE was
evaluated using the coarse features described in Section 10.1. Image features simulating
varying amounts of occlusion were prepared by shifting a varying portion of the image.
These images, along with results of coarse - fine refinement using the EM algorithm,
are shown in Figure 10-11.

The starting value for the pose was the correct (null) pose. The refinements
converged to good poses in all cases, demonstrating that the method can converge
from good alignments with moderate amounts of occlusion.

The final fine score in the most occluded example is lower than the noise level
observed in the experiment of Section 10.2. This indicates that as the amount of
occlusion increases, a point will be reached where the method will fail to produce a
good pose having a score above the noise level. In this experiment it happens before
the method fails to converge properly.

10.4 3D Recognition Experiments

10.4.1 Refining 3D Alignments

This section demonstrates use of the EM algorithm with PMPE to refine alignments
in 3D recognition. The linear combination of views method is used to accommodate
a limited amount of out of plane rotation. A two-view variant of LCV, described in
Section 5.5, is used.

A coarse - fine approach was used. Coarse PMPE scores were computed by
smoothing the PMPE objective function, as described in Section 7.3.2. The smoothing
matrix was

$$\mathrm{DIAG}\left( (7.07)^2 , \; (3.0)^2 \right) .$$

These numbers are the amounts of additional (artificial) variance added for parallel
and perpendicular deviations, respectively, in the oriented stationary statistics model.

Figure 10-11: Fine Convergences with Occlusion

Figure 10-12: Grayscale Image

The video test image is shown in Figure 10-12. It differs from the model images
by a significant 3D translation and out of plane rotation. The test image edges are
shown in Figure 10-13.

The object model was derived from the two Mean Edge Images shown in Figure
10-14. These were constructed as described in Section 4.4.

The smoothing used in preparation of the edge maps had 1.93 pixels standard
deviation, and the edge curves were broken arbitrarily every 10 pixels. Point-radius
features were fitted to the edge curve fragments, as described in Section 5.3, for
purposes of display and for computing the oriented stationary statistics, although the
features used with PMPE and the EM algorithm were simply the X and Y coordinates
of the centroids of the curve fragments. Both views of the model features are shown
in Figure 10-15. The linear combination of views method requires correspondences
among the model views. These were established by hand, and are displayed in Figure
10-16.

Figure 10-13: Image Edges

The relationship among the viewpoints in the model images and the test image is
illustrated in Figure 10-17. This represents the region of the view sphere containing
the viewpoints. Note that the test image is not on the line joining the two model
views.

The oriented stationary statistics model of feature fluctuations was used (this is
described in Section 3.3). As in Section 10.1, the parameters (statistics) that appear in
the PMPE objective function, the background probability and the covariance matrix
for the oriented stationary statistics, were derived from matches done by hand.

Figure 10-14: Model Mean Edge Images

Figure 10-15: Model Features (Both Views)

Figure 10-16: Model Correspondences

Figure 10-17: Model and Test Image View Points (figure label: Test View)

Figure 10-18: Initial Alignment

Score: -10.842E
Figure 10-19: Coarse Refined Alignment and Coarse Score

Score: -17.2661
Figure 10-20: Fine Refined Alignment and Fine Score

Figure 10-21: Fine Refined Alignment with Video Image

A set of four correspondences was established manually from the image features
to the object features. These correspondences are intended to simulate an alignment
generated by an indexing system. Indexing systems that are suitable for 3D recognition
are described by Clemens and Jacobs [19] and Jacobs [45]. The rough alignment
and score were obtained from the correspondences by one cycle of the EM algorithm,
as described above in Section 10.1.2. They are displayed in Figure 10-18, where the
four corresponding features appear circled. A coarse alignment was then obtained
by running the EM algorithm to convergence with smoothing; the result appears in
Figure 10-19. This alignment was refined further by running the EM algorithm again,
without smoothing. The resulting alignment and score are shown in Figure 10-20. In
these figures, the image features are shown as curve fragments for clarity, although
only the point locations were used in the experiment. The image features used are a
subset taken from a rectangular region of the larger image.
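
The coarse-then-fine use of EM described above can be illustrated with a small sketch. It is not the formulation used in the experiments: it assumes a 2D similarity pose (four parameters) instead of the linear combination of views, an isotropic Gaussian of standard deviation `sigma` in place of the oriented stationary statistics, a constant background term `bg`, and no pose prior. All names (`design`, `em_refine`, `make_demo`) and numeric settings are hypothetical.

```python
import numpy as np

def design(model):
    """Per-feature 2x4 matrices M_j, so that M_j @ beta is the projected feature."""
    x, y = model[:, 0], model[:, 1]
    one, zero = np.ones_like(x), np.zeros_like(x)
    top = np.stack([x, -y, one, zero], axis=1)
    bottom = np.stack([y, x, zero, one], axis=1)
    return np.stack([top, bottom], axis=1)            # shape (m, 2, 4)

def em_refine(image, model, beta, sigma, bg=1e-4, iters=50):
    """Local search in pose space: alternate soft matching and pose update."""
    M = design(model)
    for _ in range(iters):
        proj = M @ beta                                # projected model, (m, 2)
        d2 = ((image[:, None, :] - proj[None, :, :]) ** 2).sum(axis=-1)
        g = np.exp(-0.5 * d2 / sigma ** 2) / (2.0 * np.pi * sigma ** 2)
        # E-step: weight of image feature i matching model feature j;
        # bg absorbs image features better explained as background.
        w = g / (bg + g.sum(axis=1, keepdims=True))
        # M-step: weighted linear least squares for the pose vector.
        A = np.einsum("ij,jka,jkb->ab", w, M, M)
        b = np.einsum("ij,jka,ik->a", w, M, image)
        beta = np.linalg.solve(A, b)
    return beta

def make_demo():
    """Synthetic scene: 20 of 25 model features visible, plus 3 clutter points."""
    i, j = np.meshgrid(np.arange(5), np.arange(5), indexing="ij")
    model = np.stack([20 * i + (3 * i * j) % 7,
                      20 * j + (5 * i + 2 * j) % 7], axis=-1)
    model = model.reshape(-1, 2).astype(float)
    true_beta = np.array([np.cos(0.1), np.sin(0.1), 10.0, 5.0])
    visible = design(model)[:20] @ true_beta
    clutter = np.array([[300.0, 300.0], [-200.0, 50.0], [150.0, -180.0]])
    return np.vstack([visible, clutter]), model, true_beta

if __name__ == "__main__":
    image, model, true_beta = make_demo()
    start = true_beta + np.array([0.0, 0.0, 0.0, 4.0])   # rough initial pose
    coarse = em_refine(image, model, start, sigma=4.0)   # with smoothing
    fine = em_refine(image, model, coarse, sigma=1.0)    # without smoothing
    print(fine, true_beta)
```

The larger `sigma` in the first call plays the role of smoothing: it widens the basin of attraction at the cost of a slightly biased pose, which the second call then sharpens.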

Figure 10-21 displays the final alignment superimposed over the original video
image. Most of the model features have aligned well. The discrepancy in the forward
wheel well may be due to inaccuracies in the LCV modeling process, perhaps in the
feature correspondences. This figure demonstrates good results for aligning a smooth
3D object having six degrees of freedom of motion, without the use of privileged features.

10.4.2 Refining Perturbed Poses

This section describes an additional demonstration of local search in pose space using
PMPE in 3D.

The pose corresponding to the refined alignment displayed in Figure 10-20 was
perturbed by adding a displacement of 4.0 pixels in Y. This pose was then refined
by running the EM algorithm to convergence. The perturbed alignment and the
resulting coarse-fine refinement are shown in Figure 10-22. The result is very close
to the pose prior to perturbation.

Figure 10-22: Mildly Perturbed Alignment and Resulting Refinement

Figure 10-23: Perturbed Alignment and Resulting Refinement with Fine Score

Figure 10-24: Bad Alignment and Resulting Refinement with Fine Score

A similar experiment was carried out with a larger perturbation, 12.0 pixels in
Y. The results of this appear in Figure 10-23. This time the convergence is to
a clearly wrong alignment. The model has been stretched to a thin configuration,
and mismatched to the image. The resulting fine score is lower than that of the
good alignment in Figure 10-21. This illustrates a potential drawback of the linear
combination of views method. In addition to correct views, LCV can synthesize
views where the model is stretched. LCV, as used here, has 8 parameters, rather
than the 6 of rigid motion. The two extra parameters determine the stretching part
of the transformation. This problem can be addressed by checking, or enforcing, a
quadratic constraint on the parameters. This is discussed in [71].
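
The parameter count can be made concrete with a small sketch. It assumes the common two-view form of the linear combination of views, in which each coordinate of a new view is a linear combination of x1, y1, x2 (coordinates of the feature in the two model views) and a constant; that specific basis is an assumption here, and the function names are hypothetical.

```python
import numpy as np

def lcv_basis(view1, view2):
    """Rows [x1, y1, x2, 1], one per model feature (assumed two-view basis)."""
    return np.column_stack([view1[:, 0], view1[:, 1], view2[:, 0],
                            np.ones(len(view1))])

def fit_lcv(view1, view2, image_pts):
    """Least-squares estimate of the 4x2 coefficient matrix (8 parameters)."""
    coeffs, *_ = np.linalg.lstsq(lcv_basis(view1, view2), image_pts, rcond=None)
    return coeffs

def project_lcv(view1, view2, coeffs):
    """Predicted image positions of all model features for given coefficients."""
    return lcv_basis(view1, view2) @ coeffs
```

With four unknowns per image coordinate, four correspondences in general position determine the coefficients. Because the eight coefficients are unconstrained, they can also describe affine images of the object that no rigid motion produces, which is the stretching seen in Figure 10-23.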

Another similar experiment was performed starting with a very bad alignment.
The results appear in Figure 10-24. The algorithm was able to bring some features
into alignment, but the score remained low.
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Chapter 11 



Conclusions



Visual object recognition (finding a known object in scenes, where the object is
smooth, is viewed under varying illumination conditions, has six degrees of freedom
of position, is subject to occlusions and appears against varying backgrounds) still
presents problems. In this thesis, progress has been made by applying methods of
statistical inference to recognition. Ever-present uncertainties are accommodated
by statistical characterizations of the recognition problem: MAP Model Matching
(MMM) and Posterior Marginal Pose Estimation (PMPE). MMM was shown to be
effective for searching among feature correspondences and PMPE was shown effective
for searches in pose space. The issue of acquiring salient object features under varying
illumination was addressed by using Mean Edge Images.

The alignment approach, which leverages fast indexing methods of hypothesis
generation, is utilized. Angle Pair Indexing is introduced as an efficient 2D indexing
method that does not depend on extended or special features that can be hard to
detect. An extension to the alignment approach that may be summarized as align
refine verify is advocated. The EM algorithm is employed for refining the estimate of
the object's pose while simultaneously identifying and incorporating the constraints
of all supporting image features.

Areas for future research include the following:

• Indexing was not used in the 3D recognition experiments. Identifying a suitable
mechanism for this purpose that meshes well with the type of features used here
would be an improvement.

• Too few views were used in model construction. Fully automating the model
acquisition process, as described in Chapter 4, and acquiring models from more
views would help.

• Extending the formulations of recognition to handle multiple objects is
straightforward, but identifying suitable search strategies is an important and
non-trivial task.

• Incorporating non-linear models of projection into the formulation would allow
robust performance in domains having serious perspective distortions.

• Using image-like tables could speed the evaluation of the PMPE objective
function.

• Investigating the use of PMPE in object tracking or in other active vision
domains might prove fruitful.

More work in these areas will lead to practical and robust object recognition
systems.

Notation

Symbol                                       Meaning                                          Defining Section

$Y = \{Y_1, Y_2, \ldots, Y_n\}$              the image                                        2.1
$n$                                          number of image features
$Y_i \in R^v$                                image feature                                    2.1
$M = \{M_1, M_2, \ldots, M_m\}$              the object model                                 2.1
$m$                                          number of object features
$M_j$                                        model feature, frequently $M_j \in R^{v \times z}$   2.1
$\perp$                                      the background feature                           2.1
$\Gamma = \{\Gamma_1, \Gamma_2, \ldots, \Gamma_n\}$   correspondences                         2.1
$\Gamma_i \in M \cup \{\perp\}$              assignment of image feature $i$                  2.1
$\beta \in R^z$                              pose of object                                   5.1
$P(M_j, \beta)$                              projection into image                            5.1
$G_\psi(x)$                                  Gaussian probability density                     3.2, 6.1
$\psi_{ij}$                                  covariance matrix of feature pair                3.3
$\psi$                                       stationary feature covariance matrix             3.3
$\psi_\beta$                                 covariance matrix of pose prior                  6.1
$B$, $B_i$                                   background probability                           2.2, 2.4
$W_k$                                        extent of image feature dimension $k$            3.1
$\lambda_{ij}$, $\lambda$                    correspondence reward                            6.1
$\hat{x}$                                    estimate of $x$
$p(\cdot)$                                   probability (see below)

Probability notation is somewhat abused in this work, in the interest of brevity.
$p(x)$ may stand for either a probability mass function of a discrete variable $x$ or for a
probability density function of a continuous variable $x$. The meaning will be clear in
context based on the type of the variable argument. Additionally, mixed probabilities
are described with the same notation. For example, $p(\Gamma, \beta \mid Y)$ stands for the mixed
probability function that is a probability mass function of $\Gamma$ (the discrete variable
describing correspondences), and a probability density function of $\beta$ (the pose vector),
both conditioned on $Y$ (the image feature coordinates).
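
As a short worked illustration of the mixed notation, using only the symbols defined above: normalization combines a sum over the discrete correspondences with an integral over the continuous pose, and summing out the correspondences leaves an ordinary density on the pose.

```latex
\sum_{\Gamma} \int p(\Gamma, \beta \mid Y)\, d\beta = 1,
\qquad
p(\beta \mid Y) = \sum_{\Gamma} p(\Gamma, \beta \mid Y).
```

The second expression is the marginal over correspondences of the joint posterior, which is the sense in which Posterior Marginal Pose Estimation evaluates the pose only.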
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